Preface


Effective Theories have been with us since the dawn of science, but only in
recent decades have we found the idea important enough to give it a clear
and voiced name. This newfound desire is due in part to our understanding that no
finitely written theory is complete. There was a proselytizing impulse among all
those who first grasped the vision of Effective Theories. I recall as a Ph.D. student
that many fellow students coming out of Boston would repeatedly pepper their
conference talks with the words “Effective Theory”, and others sometimes joked
that they were curiously keen on celebrating their ignorance and lamented how sad
it was that they had such weak ambition. It was at that time that many particle
physicists were proudly espousing their faith in the “Theory of Everything” being
around the corner. The extremes of the two camps were in stark contrast.

Today, the culture and language of Effective Theories have permeated all of
physics. They are neither controversial nor lamentable. The concepts are deeply
ingrained in many other areas of theoretical physics. In the subsequent chapters,
several different physics subareas are touched upon but the discussions all revolve 
around Effective Theories. An abstract definition of the term is given in the first 
chapter, and fleshed out through examples in the following chapters. It is hoped 
that by the end the reader will have a good feel for how the concepts of Effective 
Theories affect the thinking of practicing scientists, and can see the power that 
explicitly agreeing to the Effective Theory mindset can have in developing richer 
theories of nature and achieving a deeper understanding. 


Overview of Subjects Covered 


In the following chapters, I wish to emphasize various aspects of Effective Theories
across various subdisciplines of physics. Chapter 2 discusses the harmonic
oscillator from an Effective Theory point of view. The harmonic oscillator is one
of the most important models of physics, and shows up in many guises across all
subdisciplines. For this reason I have chosen to start there. The chapter is
somewhat allegorical as I go through the story of coming upon a harmonic
oscillator system and trying to understand what theory may describe it. The 
concepts of Effective Theories, and the traps that people may fall into if they do 
not accept that theories are never complete, are illustrated at each step of the 
discovery process. 

In Chap. 3, I emphasize how blinded we can be to progress if we do not
understand that all theories are Effective Theories. I use the example of Newton’s
law of gravity, and argue that if scientists had had the more modern perspective of
Effective Theories, they would not only have been quite sure that an anomalous
perihelion precession of Mercury would one day be discovered, but they would
also have been able to predict roughly what size it would be. As it was, the
anomalous precession was admitted to the canon only after very painstaking experiment
and after all other explanations based on mundane effects had been exhausted.
Reluctantly, the anomaly was accepted and Einstein’s theory of gravity ultimately
legitimized it.

In no other area of science have Effective Theories played such a prominent role
as in elementary particle physics. In Chaps. 4 and 5 I focus on this subfield of
science. In Chap. 4, I give a brief introduction to the history of Effective Theories
in particle physics before coming to the main theme of Effective Theories and the
Higgs boson. The Higgs boson is the elementary scalar particle that is said to give
mass to all other known elementary particles. It achieves this by spontaneous
symmetry breaking, a concept that will be discussed in some detail. However, the
compatibility of Effective Theory ideas and the Higgs boson spontaneous symmetry
breaking scale is under dispute. The main purpose of Chap. 4 is to enable
the reader to understand what this dispute is and to give various ideas that resolve
the dispute. Unlike other chapters, this one contains advanced material that one
normally does not encounter until graduate studies. The material is there partly to
emphasize to the reader that there is no way to speak intelligibly about the subject
without that advanced material. Those who already know the background core
material may wish to skip directly to Sects. 4.4 and 4.5 where the focused discussion
on the role of Effective Theories is presented.

Finally, in Chap. 5, I show that the concepts of Effective Theory can play an 
important role in our theory choice activities. The goal of this chapter is to show 
the culture of theory choice among practicing particle physicists, which is most 
often not talked about openly among the physicists, and then to describe how the 
ideas of Effective Theories can change perceptions of what the “Best Theories” 
are. 
Chapter 1
The Utility of Effective Theories 


1.1 Definition of Effective Theories and Their 
Purpose 


“Effective Theories” are theories because they are able to organize phenomena under
an efficient set of principles, and they are effective because it is not impossibly complex
to compute outcomes. The only way a theory can be effective is if it is manifestly
incomplete. “Everything affects anything” is generally correct, but it saps confidence
in our ability to predict outcomes. Effective Theories modify this depressing maxim
by pointing out that “most things are irrelevant for all practical purposes.” A tree
falling in Peru does not appreciably affect a cannonball’s flight in Australia. Any
good Effective Theory systematizes what is irrelevant for the purposes at hand. In
short, an Effective Theory enables a useful prediction with a finite number of input
parameters.

With this definition of Effective Theories it appears that all theories are such, and
thus giving them a fancy capitalized name is pointless pedantry. However, the proper
name is useful to repeat at times as a reminder that the prominent views of science
did not always hold that theories were necessarily incomplete, and as a reminder
to go beyond a theory when and if the circumstances arise. Furthermore, the natural
tendency of young students entering science is to believe a theory is either right
or useless, when theories can never be completely right, but are rather merely Effective
Theories that are “correct enough for our purposes in this domain.” Frequent and
formalized reminders of this are helpful for newcomers to the field.

The other purpose of emphasizing the name Effective Theories is to force us 
to confront a theory’s flaws, its incompleteness, and its domain of applicability as 
an integral part of the theory enterprise. The most useful Effective Theories are 
ones where we know well their domains of applicability, and can parametrically 
assess the uncertainties induced by ignoring the “irrelevant.” They may even have 
a well-defined procedure for becoming more and more complex as one wishes to 
compute to higher accuracies. This is the case in many Effective Field Theories 
of particle physics, such as pion scattering or even graviton scattering. There is 
a science in understanding the circumstances of when questions can be addressed
using accurate, convenient Effective Theories, and it is generally acknowledged that 
scale separation (Hillerbrand 2013) is one important feature of systems that enable 
an Effective Theory to separate out well the “relevant” from the “irrelevant”. Indeed 
the phrase “irrelevant operator” is a technical term used in particle physics (Cohen 
1993) to identify small contributions to phenomena caused by dynamics at a much 
different energy scale than is being probed. This issue arises in one form or another 
in all Effective Theories and will be seen in the examples presented. 


1.2 Galileo’s Law of Falling Bodies as an Effective 
Theory 


Throughout this book we will get progressively more modern in our discussion of 
how to apply the concepts of Effective Theories to physics. We will move from 
the harmonic oscillator to Newton to Einstein to Fermi to Higgs and others. Before 
we do that, let us begin in this introductory chapter with Galileo—one of the first 
scientists who had what is recognizable as a modern perspective to scientific thought. 
Galileo was dedicated to knowing what was correct with less care about his or others’ 
preconceived ideas. He was dedicated to experimental verification as an unbiased 
arbiter of theories. He investigated many things, but we will focus on his theory of 
falling bodies, and within that context show, as a warm-up to more sophisticated 
theories later, how the concepts of Effective Theory could have engendered further 
insight into a more general theory of gravity beyond just describing a falling body. 

Let us suppose that we are back in the day of Galileo, well before Newton came 
along, and we are very mathematically sophisticated for the times. Upon reading 
Galileo’s book the Two Sciences we come across the following passage: 


When, therefore, I observe a stone initially at rest falling from an elevated position and 
continually acquiring new increments of speed, why should I not believe that such increases 
take place in a manner which is exceedingly simple and rather obvious to everybody? If now 
we examine the matter carefully we find no addition or increment more simple than that which 
repeats itself always in the same manner. This we readily understand when we consider the 
intimate relationship between time and motion; for just as uniformity of motion is defined 
by and conceived through equal times and equal spaces (thus we call a motion uniform 
when equal distances are traversed during equal time-intervals), so also we may, in a similar 
manner, through equal time-intervals, conceive additions of speed as taking place without 
complication; thus we may picture to our mind a motion as uniformly and continuously 
accelerated when, during any equal intervals of time whatever, equal increments of speed 
are given to it.... And thus, it seems, we shall not be far wrong if we put the increment of 
speed as proportional to the increment of time; hence the definition of motion which we are 
about to discuss may be stated as follows: A motion is said to be uniformly accelerated, 
when starting from rest, it acquires, during equal time-intervals, equal increments of speed 
(Galileo 1638). 


In mathematical language Galileo is saying $\delta v = g\,\delta t$, where $v$ is the speed and $g$ is
the constant of proportionality. In differential calculus language, $\delta v, \delta t \to dv, dt$.
Bringing $dt$ to the other side of the equation one can rewrite Galileo’s Law as
$dv/dt = g$. But change in velocity with respect to time is nothing other than
the acceleration, and Galileo’s law becomes $a = g$, which is “uniform acceleration”
as Galileo himself called it. Notice that the mass of the stone falling is not in this
equation. More on that later. Another way to write the above equation is


$$\ddot{z} = -g \quad \text{(Galileo’s Law of Falling Bodies)}, \tag{1.1}$$


in the convention that z is the position of the ball with increasing z in the opposite 
direction of the acceleration vector. 

As an aside, every first year physics student has computed the trajectory of a
ball in a uniform gravitational field. The equation of motion is usually derived from
Newton’s Second Law of Motion F = ma. In this case the force is $-mg$, where
$g = 9.8\ \mathrm{m/s^2}$ is the acceleration downward due to gravity on the Earth’s surface, and
$a = \ddot{z}$ is the second time derivative of the ball’s position, that is, the actual acceleration of
its trajectory. The equation of motion is then $\ddot{z} = -g$, which is exactly Galileo’s
Law. Despite everyone knowing this, the reader is here requested to forget the more
sophisticated later era of Newton, where this particular equation $\ddot{z} = -g$ is a simple
derivation of a deeper law. Instead, I would like to ask the reader to treat $\ddot{z} = -g$ as a
law of nature that has no parent: it is something stand-alone discovered by Galileo.
That is why I am giving it a fancy name: “Galileo’s Law of Falling Bodies”, or GLFB
for short. Let us press forward with GLFB, and ask what Effective Theories may say
about it.

To give us something concrete to talk about with regard to GLFB, let us compute 
the time it takes for a body at rest to drop from a height h. The position of the body 
as a function of time is 


$$z(t) = h - \frac{1}{2} g t^2. \tag{1.2}$$


Falling a distance h then takes time $T = \sqrt{2h/g}$. Notice that this does not depend
on the mass of the body, an interesting conclusion that Galileo understood well.
He knew that air friction caused bodies to slow down, and he even understood the
concept of terminal velocity,¹ but most impressively he realized that air friction was
a complication that was not fundamental to the problem:

Now seeing how great is the resistance which the air offers to the slight momentum [momento]
of the bladder and how small that which it offers to the large weight [peso] of the lead, I am
convinced that, if the medium were entirely removed, the advantage received by the bladder
would be so great and that coming to the lead so small that their speeds would be equalized
(Galileo 1638).


In other words, in the limit that the density of the body was much higher than the 
density of the air, the air friction was not important. Galileo repeated this principle 


¹ “there is no sphere so large ... or so dense ... that the resistance of the medium, although very
slight, would check its acceleration and would, in time reduce its motion to uniformity” (Galileo
1638).
in other places, and understood it well: the fundamental law of falling bodies with
resistance-less medium is uniform acceleration. 

Another demonstration of Galileo’s genius was that he understood better than 
anyone at that time that scientific claims were not only about deep thoughts that 
sounded good, but required experiment to test them and that any result was subject 
to question. At one point he took a swipe at Aristotle for holding what Galileo thought 
was an unjustified opinion: “... I greatly doubt that Aristotle ever tested by experiment 
whether it be true ...” (Galileo 1638). Galileo was certainly no respecter of persons, 
but rather had unswerving loyalty to determining what was correct. Even when he 
introduced his theory of falling bodies he qualified it by saying, “we shall not be far 
wrong” if we agree to his theory. Tentativeness, testing and refinement, the hallmarks 
of science, were important to his approach. 

Galileo surely would not have minded any correction to his law that was not in
conflict with what appeared to be sacrosanct symmetries of nature, such as invariance
under rotations and space and time translations (Arnold 1989). A correction
that seems quite reasonable is to disrupt uniform acceleration slightly by adding a
correction term that depends on height position $z$.² Thus let us add the correction
$\ddot{z} = -g + cz$, where $c$ is some “small” constant.

The constant c is unknown and so this theory is not very predictive. However, we
can make some intelligent guesses of roughly what value it could take. For one, we
know that somehow we have to make cz have units of acceleration. This requires c
to have units of acceleration/length. This is an awkward set of units. However, we
can simplify it by utilizing the one and only constant of our original theory, which is
g and has units of acceleration. Thus, the obvious thing to do is let $c \to g/R$, where
R is some unknown fixed constant of length. What could R possibly be? The test
bodies are being pulled to earth, and they are all being pulled with (nearly) uniform
acceleration independent of the size of the test body,³ and so it is very reasonable to
assume that we need to look to the Earth to provide us with a “natural length scale”
to assign R. The radius of the Earth, $R_e = 6400$ km, is the obvious candidate.⁴

If we were dogmatic and very arrogant we would say that our choices were
“obvious” and that this new law, the Adjusted GLFB (ALFB), is the correct first
correction and write $\ddot{z} = -g(1 - z/R_e)$ and then start computing. However, let us
be humble scientists and suggest that this correction is perhaps “not far wrong”, as
Galileo might say, and insert a “constant of tentativeness” $\eta$, which is dimensionless


² This is not in conflict with Galilean translation invariance, as z is shorthand for a difference in
position of the body with respect to the earth’s surface, $z = r - R_{\mathrm{earth}}$.


³ Furthermore, using the size of a small test body as the parameter R would lead to dramatically
too large effects, and for that reason also it can be dismissed as an option.


⁴ There are several other length scales that perhaps might be equally justified, including the circumference
of the earth (R = 40,000 km), the height of the tallest mountain (R = 9 km), or the
depth of the deepest sea (R = 11 km). The latter two are perhaps less intuitively relevant and could
be dismissed as serious candidates. Nevertheless, if one kept an open mind to them all, the length
scales are all within about a factor of $10^3$ of each other, which might appear disastrously large to
estimate a correction term, but it is decidedly better than not knowing how to estimate within a
factor of $\infty$.
and perhaps not far from 1. Our new ALFB can be written as

$$\ddot{z} = -g\left(1 - \eta\frac{z}{R_e} + \cdots\right) \quad \text{(Adjusted Galileo’s Law of Falling Bodies)}. \tag{1.3}$$


Writing theories down with extra terms that have “natural sizes” and are consistent 
with symmetries is a cornerstone of the Effective Theory approach. This example is 
intended to demonstrate that a new theory can be generated by having this mindset, 
and the new theory is more correct, even if a little less predictive. 

Ignoring the higher order “$\cdots$” terms, the solution to the problem of position as
a function of time now becomes


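$$z(t) = h\cosh\left(\sqrt{\frac{\eta g}{R_e}}\,t\right) - \frac{R_e}{\eta}\left[\cosh\left(\sqrt{\frac{\eta g}{R_e}}\,t\right) - 1\right] \tag{1.4}$$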


and the time it takes to reach z = 0 is 


$$T = \sqrt{\frac{2h}{g}}\left(1 + \frac{5\eta}{12}\frac{h}{R_e} + O\!\left(\frac{h^2}{R_e^2}\right)\right). \tag{1.5}$$


A body dropped from 200 m takes about a tenth of a millisecond longer according to
ALFB with $\eta = 1$ compared to the roughly 6.4 s predicted by the GLFB.
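
As a rough numerical check of the drop-time formulas, here is a minimal Python sketch. It assumes the values used in the text (g = 9.8 m/s², R_e = 6400 km) and assumes that the ALFB equation of motion is $\ddot{z} = -g(1 - \eta z/R_e)$ with the body released from rest at height h. The function names are illustrative only and do not come from the text.

```python
import math

G = 9.8        # m/s^2, surface acceleration used in the text
R_E = 6.4e6    # m, Earth radius used in the text

def glfb_drop_time(h, g=G):
    """Drop time from rest under Galileo's law: T = sqrt(2h/g)."""
    return math.sqrt(2.0 * h / g)

def alfb_drop_time_series(h, eta=1.0, g=G, r_e=R_E):
    """ALFB drop time to first order in h/R_e (the form of Eq. 1.5)."""
    return glfb_drop_time(h, g) * (1.0 + 5.0 * eta * h / (12.0 * r_e))

def alfb_drop_time_exact(h, eta=1.0, g=G, r_e=R_E):
    """Exact ALFB drop time for eta > 0 and eta*h < r_e.

    Setting z(T) = 0 in the cosh solution gives
    cosh(k T) = 1 / (1 - eta*h/r_e), with k = sqrt(eta*g/r_e).
    """
    k = math.sqrt(eta * g / r_e)
    return math.acosh(1.0 / (1.0 - eta * h / r_e)) / k

if __name__ == "__main__":
    h = 200.0  # metres
    print("GLFB          :", glfb_drop_time(h))
    print("ALFB (series) :", alfb_drop_time_series(h))
    print("ALFB (exact)  :", alfb_drop_time_exact(h))
```

Printing the three times side by side makes the size of the correction explicit, and shows how closely the first-order series tracks the exact solution when h is much smaller than R_e.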

In an alternative scientific history this effect of longer dropping time could have
been measured and the anomaly noted before Newton’s theory of gravity was decisively
understood. The measurements would have converged on $\eta = 2$ to within
experimental uncertainties. A discrepancy with Galileo’s pure GLFB would not have
been the subject of deep worries about humans’ ability to understand the laws of
the universe since Galileo himself was tentative about his law. In time, Newton’s
theory would then develop, and the value of $\eta$ would be computed to be exactly
2, and Newton’s law of gravity would then replace GLFB as the overarching theoretical
framework by which to understand and compute the trajectories of falling
bodies.
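
The value of 2 can be checked with a short expansion. The sketch below assumes a spherically symmetric, non-rotating Earth of mass M; the symbols $G_N$ and $M$ are introduced only for this illustration and are not part of GLFB or ALFB.

```latex
\ddot{z} = -\frac{G_N M}{(R_e + z)^2}
         = -g\left(1 + \frac{z}{R_e}\right)^{-2}
         = -g\left(1 - 2\,\frac{z}{R_e} + 3\,\frac{z^2}{R_e^2} - \cdots\right),
\qquad g \equiv \frac{G_N M}{R_e^2}.
```

Comparing the term linear in $z/R_e$ with the ALFB correction identifies the constant of tentativeness as 2, and the $z^2/R_e^2$ term is one of the higher order contributions that ALFB leaves unspecified.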

We have seen from this simple example that one does not need to know the 
more fundamental theory of Newtonian gravity to anticipate corrections, compute 
their effects, and compare with data. The Effective Theory of ALFB is better than 
Galileo’s original law, despite being less predictive, because ultimately it can accommodate
the data better and reflects Newton’s deeper theory. We will see another
example of this in the chain of theories in a later chapter that shows how one
could have anticipated phenomenological implications of Einstein’s General Relativity
by taking a more tentative, Effective Theory approach to Newton’s Law
of Gravity.
Chapter 2
Harmonic Oscillator as an Effective Theory 


Abstract The concepts of Effective Theories are illustrated allegorically within the 
context of one of the most ubiquitous models of oscillating physical phenomena—the 
harmonic oscillator. 


2.1 Basics of the Harmonic Oscillator 


The concepts and issues related to effective theories can be illustrated quite nicely 
by the harmonic oscillator problem. The harmonic oscillator is one of the most 
ubiquitous mathematical models of physics phenomena. It is present in almost every 
system with a restoring force, which includes the galaxy, solar system, springs, atoms, 
molecules, and innumerable other configurations. 

The main point I would like to illustrate is that the lowest order effective potential 
for the harmonic oscillator is an excellent approximation to the motion of a system 
over a wide range of amplitudes. However, at some point it breaks down when 
the amplitude is large enough, and then control over the system is lost unless a 
deeper theory is understood. We shall not go into the construction of deeper theories 
in this chapter, but rather focus on the domain of applicability of the harmonic 
oscillator effective theory, and show how small corrections can be anticipated and 
then measured by precise experiments to start building a more complete picture of 
the potential governing the system. 

To keep the illustration simple, we will restrict ourselves to one-dimensional 
harmonic motion of a particle subject to the restoring potential $V(x) = kx^2/2$. The
Lagrangian of the system is then


$$L = \frac{1}{2} m\dot{x}^2 - \frac{1}{2} k x^2. \tag{2.1}$$
From the principle of least action, the equation of motion takes the form of Newton’s second law
of motion, F = ma:

$$m\ddot{x} = -kx \implies m\ddot{x} + kx = 0. \tag{2.2}$$

Defining $\omega^2 = k/m$, we can rewrite this as

$$\ddot{x} + \omega^2 x = 0 \tag{2.3}$$


which has the solution 
$$x(t) = A\sin(\omega t) \tag{2.4}$$


where A is the amplitude, and the boundary condition of x = 0 at t = 0 is enforced. 
Let us review a few basic facts about the harmonic oscillator solution. The period is 


$$T_{\mathrm{period}} = \frac{2\pi}{\omega} = 2\pi\sqrt{\frac{m}{k}}. \tag{2.5}$$


The amplitude A of motion is related to the initial velocity by equating full potential 
energy at maximum amplitude to the full kinetic energy at maximum velocity: 


$$\frac{1}{2} m v_{\max}^2 = \frac{1}{2} k A^2 \implies A = v_{\max}\sqrt{\frac{m}{k}} = \frac{v_{\max} T_{\mathrm{period}}}{2\pi}. \tag{2.6}$$


It should also be noted that the period of the harmonic motion is not dependent on 
the amplitude of the motion. This is clear from Eq. 2.5 where it is shown that the
period only depends on the input parameters m and k. The amplitude and maximum
velocity conspire with each other such that $v_{\max}/A$ is always equal to $\sqrt{k/m}$.


2.2 Ubiquity of the Harmonic Oscillator 


The harmonic oscillator problem is ubiquitous in physics, describing small motions 
of an object attached to a string, molecules vibrating in crystals, electrical circuit 
response, etc. There is a straightforward reason why there are so many examples that 
follow simple harmonic behavior. Let us suppose that the equilibrium point (i.e., the 
minimum of the potential) is about the origin. Then, the potential for motion is a 
power series of the form 


$$V(x) = a_2 x^2 + a_3 x^3 + a_4 x^4 + \cdots. \tag{2.7}$$


We do not write down a constant term or a term linear in x because the first is 
irrelevant and the second term cannot be present if x = 0 is a local minimum. If it 
is present, one shifts x to cancel it, which is the place of the new extremum.¹ There
are an infinite number of potentials that can be written down, with various relative
weightings of $x^4$, $x^{14}$, etc. The motions of a particle or entity about the equilibrium
can be very different depending on the potential. 

Nevertheless, harmonic motion is ubiquitous because at
values of x below some critical value $x_{\mathrm{crit}}$ the potential is always dominated by the
$x^2$ term. For example, in comparing the $a_2 x^2$ term to the $a_3 x^3$ term, the ratio is


$$\frac{a_2 x^2}{a_3 x^3} = \frac{a_2}{a_3 x} \implies a_2 x^2 \text{ term dominates over } a_3 x^3 \text{ when } x < x_{\mathrm{crit}} = \frac{a_2}{a_3}. \tag{2.8}$$


In other words, small enough amplitudes are always very well described by simple
harmonic motion in an $x^2$ potential.

In the following we will investigate an abstract system that has harmonic oscillation
in the “low-energy limit”, when the amplitude is small. We shall see that through a
combination of precision measurements and venturing into the high-energy unknown 
we can learn more about the system. In the course of these investigations I wish to 
give a sense of the usefulness of thinking in terms of effective theories, as well as 
seeing the limitations of it. 


2.3 First Theory 


Let us suppose that there exists a System² that appears to be undergoing harmonic
oscillation. For simplicity, the System will be chosen to have lengths of amplitude 
and times for the period of motion to be measured most conveniently in meters and 
seconds; however, this is only for intuitive concreteness, and one can multiply these 
units by orders of magnitude in any direction as appropriate for different systems. 

In the earliest stages of investigation of the System we see that it is undergoing
oscillatory behavior with a period of about 10 s. The resolution of the instrumentation
is not good enough to resolve any deviations from pure harmonic motion, and so we
posit that the motion is governed by the potential

$$V(x) = \frac{1}{2} m\omega^2 x^2 \implies \ddot{x} + \omega^2 x = 0 \quad \text{(Theory 1)}. \tag{2.9}$$


Let us now suppose that we try to test this theory by precision measurements. 
Again, at the early stages of experimenting on a system, the resolution may not be 
so good. Let us suppose that is the case for our simple System, and assume that the 
period is measured to be 


¹ If for some reason $a_2 = 0$, then $a_3$ will need to be zero also, otherwise x = 0 is not a local
minimum, and the first term to worry about is $x^4$. This is a complication that we need not worry
about for now.


² We capitalize System to give it a reference name for the rest of the discussion.
$$T_{\mathrm{period}} = 10 \pm 0.3\ \mathrm{s} \quad \text{(Measurement 1)}. \tag{2.10}$$


This period of motion can be accommodated by our theory as long as 


$$\omega = 0.63 \pm 0.02\ \mathrm{s^{-1}} \quad \text{(Parameter Fit 1)}. \tag{2.11}$$


It is no mystery that we could find a value of $\omega$ that fits the period. No matter how well
we measure the period, it is only one observable and the theory has one parameter 
that can always be adjusted to match it. We need more observables to test the validity 
of the theory more fully. 
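
The step from a measured period to a fitted frequency can be sketched in a few lines of Python. This assumes standard first-order error propagation for $\omega = 2\pi/T$, which the text does not spell out; the function name is illustrative.

```python
import math

def omega_from_period(period, sigma_period):
    """Return (omega, sigma_omega) for omega = 2*pi/T.

    The uncertainty uses first-order propagation,
    sigma_omega = omega * sigma_T / T, valid when sigma_T << T.
    """
    omega = 2.0 * math.pi / period
    sigma_omega = omega * sigma_period / period
    return omega, sigma_omega

if __name__ == "__main__":
    omega, sigma = omega_from_period(10.0, 0.3)
    print(f"omega = {omega:.2f} +/- {sigma:.2f} s^-1")
```

With a single observable and a single parameter the fit cannot fail, which is exactly why it says nothing about whether the theory is right.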


2.4 Second Theory 


Another drawback of having just one observable is that there are an infinite number 
of theories that we could write down trivially whose parameters could be adjusted in 
an infinite continuum of values to accommodate the measurement. One such theory 
has the same potential as Theory 1 except that we now add an $x^3$ correction term to
the potential,


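$$V(x) = \frac{1}{2} m\omega_A^2 x^2 - \frac{1}{3}\frac{m\omega_A^2}{\Lambda_A} x^3 \implies \ddot{x} + \omega_A^2 x\left(1 - \frac{x}{\Lambda_A}\right) = 0 \quad \text{(Theory 2)}, \tag{2.12}$$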


where $\omega_A$ and $\Lambda_A$, a new length scale, are two parameters that can be related
to each other so as to give the same period. Here are two sets of values:


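$$\omega_A = 0.63\ \mathrm{s^{-1}} \text{ and } \Lambda_A = \infty \tag{2.13}$$
$$\omega_A = 0.631\ \mathrm{s^{-1}} \text{ and } \Lambda_A = 250\ \mathrm{m} \quad \text{(Parameter Fit 2)} \tag{2.14}$$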


where the first line is equivalent to Theory 1 and the second line is just one parameter 
fit out of an infinite number of possibilities. 

Upon close inspection of Theory 2 we notice that the correction term always
generates a force in the same direction no matter what the value of x: it pushes the
particle away from the origin when x is positive and pulls it back toward the origin when
x < 0, whereas the first term is always restoring. This should create an asymmetry
between the time it takes the Particle to return to x = 0 half-way through its full periodic
motion and the time it takes to cross x = 0 again on the second half of
the motion. We can compute this difference in time. Even though the total period
$T_{\mathrm{period}} = 10$ s stays the same, the durations of the first and second halves of
the motion would be asymmetric if $x/\Lambda_A$ is not too suppressed:


$$T^{1/2}_{\mathrm{period}} \neq T^{-1/2}_{\mathrm{period}}. \tag{2.15}$$
Therefore, an important additional observable to measure is the pair of “half periods”, to
see if they are asymmetric as Theory 2 predicts.
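
A numerical sketch can make the size of this asymmetry concrete. The Python below assumes the Theory 2 equation of motion in the form $\ddot{x} + \omega_A^2 x (1 - x/\Lambda_A) = 0$, launches the particle from x = 0 with an illustrative speed of 10 m/s, and integrates with a hand-written fourth-order Runge-Kutta step. The function name `half_periods`, the step size, and the sign convention of the cubic term are assumptions of this sketch.

```python
import math

def half_periods(omega, lam, v0, dt=1e-3, t_max=1e3):
    """Durations of the x > 0 and x < 0 halves of one cycle.

    Integrates x'' = -omega^2 * x * (1 - x/lam) from x = 0, v = v0 > 0
    and records the next two crossings of x = 0.
    """
    def acc(x):
        return -omega ** 2 * x * (1.0 - x / lam)

    def rk4_step(x, v):
        k1x, k1v = v, acc(x)
        k2x, k2v = v + 0.5 * dt * k1v, acc(x + 0.5 * dt * k1x)
        k3x, k3v = v + 0.5 * dt * k2v, acc(x + 0.5 * dt * k2x)
        k4x, k4v = v + dt * k3v, acc(x + dt * k3x)
        x_new = x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        v_new = v + dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        return x_new, v_new

    t, x, v = 0.0, 0.0, v0
    crossings = []
    while len(crossings) < 2:
        if t > t_max:
            raise RuntimeError("no return to x = 0; motion may be unbounded")
        x_new, v_new = rk4_step(x, v)
        # Sign change between steps: locate the crossing by linear interpolation.
        if (x > 0.0 and x_new <= 0.0) or (x < 0.0 and x_new >= 0.0):
            crossings.append(t + dt * x / (x - x_new))
        t, x, v = t + dt, x_new, v_new
    return crossings[0], crossings[1] - crossings[0]

if __name__ == "__main__":
    for lam in (math.inf, 250.0):
        t_pos, t_neg = half_periods(0.631, lam, 10.0)
        print(lam, t_pos, t_neg, t_pos + t_neg)
```

Passing `math.inf` for the length scale switches the correction off and recovers the symmetric Theory 1 motion, which gives a quick sanity check on the integrator before comparing against a finite $\Lambda_A$.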
Let us now suppose that there are improvements in the experimental instrumentation
such that we can measure each “half period”, $T^{1/2}_{\mathrm{period}}$ and $T^{-1/2}_{\mathrm{period}}$, and that it can be
done to accuracies of 0.01 s. And let us suppose that after some time of measurement
it is determined that


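$$T^{1/2}_{\mathrm{period}} = 5.05\ \mathrm{s}, \quad T^{-1/2}_{\mathrm{period}} = 5.06\ \mathrm{s}, \quad \text{and} \quad T_{\mathrm{period}} = 10.11 \pm 0.01\ \mathrm{s} \quad \text{(Measurements 2)}. \tag{2.16}$$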


To within the error bars of 0.01 s the two period halves are equal. 

The usual scientific approach to the present situation is to say that the simpler 
model wins out if it accommodates the data as well as the more complicated theory. 
Thus, the community of scholars faced with the measurements above may well
conclude that Theory 1 is correct, or conclude that even if the $x/\Lambda_A$ term is present
it is so suppressed that it is immaterial to the physics.

As we shall discuss later, this is the kind of statement that one might find in particle 
physics when considering higher dimensional operators of Standard Model particles. 
As in particle physics we may hold firm to the idea that there is no reason why these 
extra terms should not exist. Indeed, in an effective theory the full series expansion 
of additional terms should exist. But we must acknowledge that their coefficients 
may be too small to discern from our experiments. 


2.5 Fancy Explanations 


Not seeing the effects of the asymmetric $x/\Lambda_A$ term after greatly improving the
experimental situation to look for it would likely get the community thinking hard
about the reasons for that failure. As we already mentioned, the diehard believers would
just say that $\Lambda_A$ has a value just beyond what the experimental sensitivity can reveal.
Others would invent reasons for why $x/\Lambda_A$ should never have been there in the first
place. These reasons need to be based on some kind of symmetry argument.

There are two straightforward symmetry arguments that would banish the $x/\Lambda_A$
correction to the potential. The first argument is to presume that the potential has an
$x \to -x$ discrete symmetry. This would banish all odd corrections that could give
rise to asymmetric half periods. Our next correction would then be $x^2/\Lambda^2$. We will
investigate the experimental consequences of that potential shortly.

Another symmetry argument says that the harmonic oscillator Lagrangian is exact
because of a conformal symmetry, $x \to \lambda x$, where $\lambda$ is some arbitrary scaling parameter.
Although the Lagrangian is not invariant under this, the equations of motion are. It is
this scaling symmetry that tells us that time observables are independent of the spatial
scaling. In other words, the (time) period is independent of the (spatial) amplitude.
There is a temptation of smart people to promote the most sophisticated and
fancy arguments to explain the phenomena. It is not very sophisticated to say “the 
additional terms are too small to see”. But it is fancy to say things like “conformal 
symmetry” and “discrete symmetry.” And if the experimental situation languishes 
long enough theorists can become even more sophisticated with their description of 
why these terms must be banished, and look down upon people who do not catch the 
fever of fancy explanations. And if it goes on even longer it will be so entrenched in 
the highest schools of the land, that few will want to challenge it by proposing ways 
to find evidence for non-fancy corrections to the spatial scale-invariant theory. 


2.6 Third Theory 


Nevertheless, let us suppose that we take courage and wish to press forward in 
testing Theory 1 yet again. Odd corrections may exist, but we may need orders of
magnitude more precision to see evidence for them. We may have more
luck introducing only even power corrections to the potential. So we shall do this by
introducing


$$V = \frac{1}{2}\omega_B^2 x^2 - \frac{\omega_B^2}{4\Lambda_B^2}x^4 \;\Longrightarrow\; \ddot{x} + \omega_B^2 x\left(1 - \frac{x^2}{\Lambda_B^2}\right) = 0 \quad \text{(Theory 3)} \qquad (2.17)$$

What can we do to test and try to strain the theory? We know that measuring 
the half-periods does no good. However, being excellent students of the prevailing 
scale-invariant idea, we know that the period should not change depending on the 
amplitude. We need to find a way to perturb the system to increase the amplitude and 
see if the period changes.³

Let us suppose in our system that the particle passes through the origin with 
velocity of 10 m/s. Changing it requires significant technical skill, but we find a way 
to do it. We increase the energy into the system and obtain a new initial velocity of 
15 m/s, which increases the amplitude by approximately 50%. Upon measuring the
period we get


$$T_{\rm period} = 10.25\ \mathrm{s} \pm 0.01\ \mathrm{s} \quad \text{(Measurement 3)} \qquad (2.18)$$


which differs by many standard deviations from the 10.11 s value obtained when
$v_{\rm initial} = 10$ m/s, and is a clear signal for breaking of the spatial scale invariance of
the equations of motion. This is the first firm proof that the exact harmonic motion
law of $V(x) \propto x^2$ is not fully respected.


³ It is here I would like to remind the reader again that this is a fanciful allegory of how experiment and
theory interplay on the effective theory stage, and although a simple macroscopic harmonic motion
system can be manipulated and measured in all sorts of ways with ease, in other systems it is sometimes
significantly more challenging to do the analog of measuring half periods or of increasing the
amplitudes.

We are likely to be quite excited about this, because we posited a theory that said 
there should be violations of scale invariance when the amplitude grows. And now 
that we see it we want to fit the parameters. Here is one such choice that works well 


$$\omega_B = 0.63\ \mathrm{s}^{-1} \quad \text{and} \quad \Lambda_B = 95\ \mathrm{m} \quad \text{(Parameter Fit 3)} \qquad (2.19)$$


The two measurements at two different velocities are accommodated by these two 
choices of parameters. 
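
As a rough numerical companion to Parameter Fit 3, the sketch below evaluates the period of Theory 3, taken here to be $\ddot{x} + \omega_B^2 x(1 - x^2/\Lambda_B^2) = 0$ with unit mass, for a particle that passes through the origin with speed `v0`. The function name `theory3_period`, the midpoint-rule quadrature and the step count are choices made for this illustration and are not part of the original discussion. Because the quoted parameters are rounded to two figures, the printed periods should be expected to track the quoted measurements only approximately.

```python
import math

def theory3_period(v0, omega=0.63, lam=95.0, n=2000):
    """Period (s) of x'' + omega^2 x (1 - x^2/lam^2) = 0 for a particle
    passing through x = 0 with speed v0 (m/s). Unit mass is assumed."""
    # Energy conservation at the turning point x = A:
    #   v0^2/2 = omega^2 A^2/2 - omega^2 A^4/(4 lam^2),
    # which is a quadratic in A^2; the bounded orbit uses the smaller root.
    disc = 1.0 - 2.0 * v0**2 / (omega * lam) ** 2
    if disc <= 0.0:
        raise ValueError("v0 is at or above the escape speed; motion is unbounded")
    eps = 1.0 - math.sqrt(disc)            # eps = A^2 / lam^2

    # With x = A sin(theta) the period integral becomes
    #   T = (4/omega) * Int_0^{pi/2} dtheta / sqrt(1 - eps (1 + sin^2 theta) / 2),
    # evaluated here with a simple midpoint rule.
    h = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        s = math.sin((i + 0.5) * h)
        total += 1.0 / math.sqrt(1.0 - 0.5 * eps * (1.0 + s * s))
    return 4.0 / omega * total * h

for v0 in (10.0, 15.0, 30.0):
    print(f"v0 = {v0:4.1f} m/s  ->  period = {theory3_period(v0):.3f} s")
```

In the limit of small `v0` the integrand tends to one and the function returns the harmonic value $2\pi/\omega_B$; the period then grows with the launch speed, which is the amplitude dependence that Measurement 3 probes.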

Theory 3 is “better” than the old simple harmonic oscillator law of Theory 1, 
because it accounts for all the data. It accounts for equal half periods, and accounts 
for the measurements when the initial velocity is at v = 10 m/s and at v = 15 m/s. 
However, Theory 3 is not the only theory that could do this. We could have had an $x^6$
correction, for example, that would have fit this limited amount of data just as well.
Dissatisfaction may set in that we cannot be confident of any precise formulation 
of the theory to describe the system. If arbitrary corrections are allowed now, then 
anything goes. 

This is both the beauty and the frustration of effective theories. Being committed
to the notion that all terms consistent with the symmetries we believe to be sacrosanct
should be allowed in a potential, and then testing them with ever increasing
experimental sophistication, has given us insight that deviations from the pure
harmonic oscillator potential are possible. However, these ideas of effective theory
appear to have muddied the waters rather than to have led to “the theory.” We come
to the realization that this is one of the limitations of effective theories. By itself it
cannot raise you to a deeper physical insight. It is merely a statement that all operators
(i.e., all corrections) should be added to your theory and then experiment can
measure or put limitations on the couplings. However, if you do happen onto a deeper
theoretical insight, that can put order to all the operators that may arise.


2.7 Deep Theory Conjecture 


Now let us suppose that we let our success get to our heads, and we become supremely 
confident that we know of a deeper theory to explain the data. Never mind how we
came to it (that is not important here), but suppose the deep theory we become
convinced of is


$$V = \omega_T^2 L_T^2\left[1 - \cos(x/L_T)\right] \;\Longrightarrow\; \ddot{x} + \omega_T^2 L_T \sin(x/L_T) = 0 \quad \text{(Theory 4)} \qquad (2.20)$$


The data that has been taken to date suggests that


$$\omega_T = 0.63\ \mathrm{s}^{-1} \quad \text{and} \quad L_T = 38.8\ \mathrm{m} \quad \text{(Parameter Fit 4)} \qquad (2.21)$$

We note that there is no difference between Theory 3 predictions and Theory 4
predictions as long as the initial speed stays below 20 m/s and the timing resolution
is not better than 0.01 s.
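
The low-speed agreement can be traced to the small-$x$ expansion of the Theory 4 force. As a short check, assuming the Theory 3 equation of motion $\ddot{x} + \omega_B^2 x(1 - x^2/\Lambda_B^2) = 0$ and writing the Theory 4 frequency and length as $\omega_T$ and $L_T$ (the subscript labels are a notational assumption here):

```latex
\begin{align*}
\omega_T^2 L_T \sin\!\left(\frac{x}{L_T}\right)
  &= \omega_T^2 L_T\left(\frac{x}{L_T} - \frac{x^3}{6 L_T^3} + \mathcal{O}(x^5)\right)
   = \omega_T^2\, x\left(1 - \frac{x^2}{6 L_T^2}\right) + \mathcal{O}(x^5), \\
\Lambda_B &= \sqrt{6}\, L_T \approx 2.449 \times 38.8\ \mathrm{m} \approx 95\ \mathrm{m},
\qquad \omega_B = \omega_T .
\end{align*}
```

So the two parameter fits describe the same cubic correction, and the theories differ only at order $x^5$ in the force, which becomes visible only at larger amplitudes.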

However, we can make a bold prediction based on our new deep and fundamental 
theory conjecture: if the initial velocity is doubled to 30 m/s the period jumps to
11.23 s, whereas for Theory 3 the prediction is 11.36 s. Experimentalists may puzzle
over how to double the initial velocity for many years, but finally are able to do it.
When they collect the data, they find $T_{\rm period} = 11.35$ s $\pm$ 0.01 s, which is a dramatic
confirmation of Theory 3, and the hubris of conjecturing Theory 4 is defeated.


2.8 Ultimate Test? 


After the extreme test of Theory 3, which was years in the making and passed so 
decisively and impressively, the smart people figure out lots of fancy language to 
explain why it had to be true and what symmetry properties it has. It is written in 
every textbook. However, there was one more experiment that people wished to do. 
For years it has been suggested that if you are able to reach initial speeds greater 
than 42 m/s the particle will never come back. In other words, the initial energy will
be so great that it will exceed the confining potential barrier of Theory 3. However, 
getting to 42 m/s is a technological nightmare, and it will take decades to do it. 
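
The 42 m/s threshold can be recovered from the barrier height of the Theory 3 potential. A brief sketch, assuming the potential per unit mass whose derivative gives the Theory 3 equation of motion and the rounded values of Parameter Fit 3:

```latex
\begin{align*}
V(x) &= \frac{1}{2}\omega_B^2 x^2 - \frac{\omega_B^2}{4\Lambda_B^2}x^4,
\qquad V'(x) = \omega_B^2 x\left(1 - \frac{x^2}{\Lambda_B^2}\right) = 0
\;\Longrightarrow\; x = \pm\Lambda_B, \\
V_{\max} &= V(\Lambda_B) = \frac{1}{4}\omega_B^2\Lambda_B^2, \\
\frac{1}{2}v_{\rm esc}^2 &= V_{\max}
\;\Longrightarrow\;
v_{\rm esc} = \frac{\omega_B \Lambda_B}{\sqrt{2}}
\approx \frac{0.63\ \mathrm{s}^{-1} \times 95\ \mathrm{m}}{1.414} \approx 42\ \mathrm{m/s}.
\end{align*}
```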

But let us suppose that after decades of R&D, it has been figured out how to launch 
the particle to speeds of 50 m/s from x = 0. When the experiment is conducted the 
particle flies off into the unknown. Twenty seconds go by, one minute goes by, an 
hour goes by, days and months go by, and the particle has never returned. Scientists 
are not surprised, but a little disappointed. It would be so much fun for a new anomaly 
to happen, but the theory looks solid and inviolate. 

The scientists may move on, and study other things like sandpiles and solar flares. 
But one day, many years later, the particle returns! And nobody knows why, except 
a bright young student who realizes that the next term in the effective potential may 
have been what returned it. 


Chapter 3 
Effective Theories of Classical Gravity 


Abstract If the concepts underlying Effective Theory were appreciated from the 
earliest days of Newtonian gravity, Le Verrier’s announcement in 1845 of the anom- 
alous perihelion precession of Mercury would have been no surprise. Furthermore, 
the size of the effect could have been anticipated through “naturalness” arguments 
well before the definitive computation in General Relativity. Thus, we have an illus- 
tration of how Effective Theory concepts can guide us in extending our knowledge 
to “new physics”, and not just in how to reduce larger theories to restricted (e.g., 
lower energy) domains. 


3.1 Introduction 


The purpose of these lectures is to introduce the concepts of Effective Theories 
to students of Philosophy, Mathematics and Physics who have a shared interest 
in the philosophy and history of physics. The concept I wish to discuss, Effective 
Theory, is a thoroughly modern notion; nevertheless, I wish to illustrate it with a very 
old and intuitively accessible problem in physics: Mercury’s anomalous perihelion 
precession. 

Le Verrier announced in 1845 a small discrepancy in the precession rate of
Mercury’s perihelion compared to Newton’s theory, even after taking into account
all the disturbing influences throughout the solar system such as the effect of other
planets’ orbits.¹ This came as a surprise, and more or less nobody believed at
the time that it was the fault of Newton, but rather the fault of observers who
had not seen the other celestial bodies that must surely be perturbing Mercury’s
orbit. Historically, that is the beginning of the problem. Le Verrier believed that
an as-yet unobserved mass distribution inside the orbit of Mercury was the source
of the discrepancy. He and others advocated the existence, for example, of a new
small planet (“Vulcan” as it was sometimes called) that would be observed when
astronomers developed the instruments necessary to find it (Roseveare 1982). Such
was not the case. By the 1890s it became clear to most that new large-scale objects
were not the explanation (Oppenheim 1920), despite some ill-fated protestations
otherwise (Poor 1921). The resolution of the problem came with Einstein’s General
Relativity, which predicted precisely the 43″ of arc per century observed, and the case was
closed.


¹ In 1859 Le Verrier gave a number for this advance: 35 arcseconds per century (Le Verrier 1859).
It was later reevaluated by S. Newcomb (Newcomb 1882), who determined the correct value of 43
arcseconds per century.

However, I want to argue that anticipation of the “problem” could have occurred
much before Le Verrier. What prevented scientists from anticipating Mercury’s
perihelion precession was not lack of mathematical skill, or lack of experimental abilities.
It was solely due to not having the right mindset. Unlike perhaps in decades and
centuries gone by, no competent scientist should retain an unfailing commitment to any
theory. All theories are incomplete, even given that some theories are better than
others. The code phrase of this mindset is Effective Theories. The concept is a
powerful one that has borne much fruit in theories of particle physics, condensed matter
systems, and even cosmology.

These notes are meant to be a somewhat pedagogical and technical exposition of 
the Mercury problem and the application of Effective Theory ideas to the problem. In 
some parts of this lecture I will follow an “alternative history” path with the scientists 
Alice and Bob who vaguely understand the importance of Effective Theories and 
who will devise a theory that can accommodate the perihelion precession rate well
before Einstein’s General Relativity comes along, and may even be able to predict
roughly the numerical rate of the precession and make predictions for other planets
through “naturalness” arguments. The latter could have been possible after diligent 
reflections on the philosophical challenges of Newton’s theory. I will compute the 
General Relativity rate at the end, in order to show how elegantly it comes out of that 
more complete theory, and to show that it matches the Effective Theory “predictions” 
by Bob and Alice. And finally I will conclude with some more remarks on the meaning 
of the results. 


3.2 Orbits in Newton’s Theory 


To remind some students who have not seen celestial mechanics for some time, 
we begin with the computation of particle orbits in Newton’s gravity. The reader 
familiar with these basics should feel free to skim the section only for definitions 
and conventions that I will use later. 

We know that the orbits predicted by Newton’s law of gravity are respected quite 
well by the planets, and so any change in the equations of motion for the orbits will 
need to be small perturbations. In Newton’s gravity, a test particle with mass $m$ orbits
around a particle of mass $M \gg m$ according to the equations of motion derived from
the lagrangian


$$L = \frac{1}{2} m \dot{\vec{r}}^{\,2} + \frac{\alpha}{r} \qquad (3.1)$$


where $\alpha = GMm$ with $G$ being Newton’s constant. $M$ represents the sun’s mass in
this lecture, and $m$ the planet’s mass, most often Mercury. It is appropriate to assume
that $M \gg m$ such that any correction is negligible due to the difference of $m$ from
the reduced mass $\mu = Mm/(M + m)$, which is technically the precise mass one
should use in the kinetic energy term. Lagrange’s equations of motion are


$$m \ddot{\vec{r}} = -\frac{\alpha}{r^2}\,\hat{r} \qquad (3.2)$$


where $\hat{r}$ is the unit vector in the $r$ direction.


3.2.1 Orbital Solution 


The lagrangian is rotationally invariant, and so the motion of the particle is most 
conveniently evaluated by casting the vector equation of motion into the two polar 
component equations 


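$$m(\ddot{r} - r\dot{\phi}^2) = -\frac{\alpha}{r^2} \quad \text{(radial equation)} \qquad (3.3)$$
$$m(2\dot{r}\dot{\phi} + r\ddot{\phi}) = 0 \quad \text{(angular equation)} \qquad (3.4)$$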


The second equation is equivalent to

$$\frac{d}{dt}\left(m r^2 \dot{\phi}\right) = 0 \qquad (3.5)$$


which implies that $m r^2\dot{\phi}$ is a constant in time. At the apogee (furthest) or perigee
(closest) point of the orbit the radius vector $\vec{r}$ is exactly perpendicular to the
momentum, $\vec{p} = m r\dot{\phi}\,\hat{\phi}$, and the magnitude of the angular momentum vector
$\vec{\ell} = \vec{r} \times \vec{p}$ is exactly $m r^2\dot{\phi}$. Since angular momentum is conserved and $m r^2\dot{\phi}$ is
conserved, if they are equal at one point they are equal at all points in the orbit. Thus,
the constant value inside the time derivative of Eq. (3.5) is none other than the angular
momentum: $\ell = m r^2\dot{\phi}$. This also proves that the motion is in a plane. Since angular
momentum is a conserved vector quantity, the direction must also be preserved, which
is only possible if $\vec{p}$ perpetually lies in the same plane as $\vec{r}$. This justifies our evaluation
of a three-dimensional problem in terms of just two variables $(r, \phi)$ in the plane of
motion.

Let us now solve the radial differential equation to obtain an exact solution of
the orbit for particle $m$. By rewriting $r = 1/u$, recasting all time derivatives as
$d/dt \to \dot{\phi}\, d/d\phi$ when possible, and recognizing that $\dot{\phi} = \ell/(m r^2)$ from conservation
of angular momentum, one finds that the governing differential equation of motion is


$$\frac{d^2 u}{d\phi^2} + u = \frac{\alpha m}{\ell^2}. \qquad (3.6)$$

Interestingly, this equation takes the form of the harmonic oscillator equation. The
solution is


$$u(\phi) = u_0 \cos\phi + \frac{\alpha m}{\ell^2} \qquad (3.7)$$


where $u_0$ is a constant that is not determined by the theory but by the particular
circumstances (i.e., initial conditions) of the system. In terms of the more direct variable $r$,
the solution is


$$r(\phi) = \frac{p}{1 + e\cos\phi}, \quad \text{where } p = \frac{\ell^2}{\alpha m} \text{ and } e = u_0 p, \qquad (3.8)$$

and it is assumed that $\phi = 0$ is at perigee. $p$ is sometimes called the latus rectum
of the orbit.
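
For readers who want the intermediate steps behind Eq. (3.6), the change of variables can be spelled out as follows, using only $u = 1/r$ and $\dot{\phi} = \ell u^2/m$:

```latex
\begin{align*}
\dot{r} &= \frac{d}{dt}\left(\frac{1}{u}\right)
         = -\frac{1}{u^2}\frac{du}{d\phi}\,\dot{\phi}
         = -\frac{\ell}{m}\frac{du}{d\phi}, \\
\ddot{r} &= -\frac{\ell}{m}\frac{d^2u}{d\phi^2}\,\dot{\phi}
          = -\frac{\ell^2 u^2}{m^2}\frac{d^2u}{d\phi^2}, \qquad
r\dot{\phi}^2 = \frac{\ell^2 u^3}{m^2}, \\
m(\ddot{r} - r\dot{\phi}^2) &= -\alpha u^2
\;\Longrightarrow\;
-\frac{\ell^2 u^2}{m}\left(\frac{d^2u}{d\phi^2} + u\right) = -\alpha u^2
\;\Longrightarrow\;
\frac{d^2u}{d\phi^2} + u = \frac{\alpha m}{\ell^2}.
\end{align*}
```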
The constant e is called the eccentricity with which one can classify an orbit as 
circular (e = 0), elliptical (0 < e < 1), parabolic (e = 1), or hyperbolic (e > 1). 
Focusing on the 0 < e < 1 case of elliptical or circular orbits, we find that 


$$r_{\min} = \frac{p}{1+e}, \quad \text{and} \quad r_{\max} = \frac{p}{1-e}. \qquad (3.9)$$


The relation between the semimajor axis $a$ of the elliptical orbit and the other variables
is given by


$$a = \frac{r_{\min} + r_{\max}}{2}, \quad \text{which implies } p = a(1 - e^2). \qquad (3.10)$$
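
These relations are easy to check against tabulated orbital data. The sketch below is only an illustration: the helper name `orbit_geometry` is hypothetical, and the Mercury inputs ($a = 0.387$ au, $e = 0.206$) are the rounded values listed in Table 3.1.

```python
def orbit_geometry(a, e):
    """Return (p, r_min, r_max) for a bound orbit with semimajor axis a
    and eccentricity e, using p = a(1 - e^2), r_min = p/(1+e), r_max = p/(1-e)."""
    if not 0.0 <= e < 1.0:
        raise ValueError("bound (circular or elliptical) orbits require 0 <= e < 1")
    p = a * (1.0 - e**2)
    r_min = p / (1.0 + e)
    r_max = p / (1.0 - e)
    return p, r_min, r_max

# Mercury, with lengths in astronomical units.
p, r_min, r_max = orbit_geometry(0.387, 0.206)
print(f"p = {p:.3f} au, r_min = {r_min:.3f} au, r_max = {r_max:.3f} au")
```

By construction $(r_{\min} + r_{\max})/2$ returns the input semimajor axis, which is a quick consistency check on Eq. (3.10).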


3.2.2 The Hamiltonian and $V_{\rm eff}$ Description


An alternative way to approach the problem is to compute the Hamiltonian and 
consider the orbit from the perspective of a one-dimensional effective potential for 
radial motion. I provide the very basics of this to remind the students of the formalism 
which is used by some papers relevant to the perihelion precession. We first expand
the lagrangian in terms of radial and angular coordinates starting from the identity


$$\dot{\vec{r}}^{\,2} = \dot{r}^2 + r^2\sin^2\theta\,\dot{\phi}^2 + r^2\dot{\theta}^2 = \dot{r}^2 + r^2\dot{\phi}^2 \quad (\text{valid in the } \sin\theta = 1 \text{ fixed orbital plane}) \qquad (3.11)$$


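The Hamiltonian is constructed as


$$H = \sum_i \dot{q}_i p_i - L \qquad (3.12)$$

using the momentum factors

$$p_r = \frac{\partial L}{\partial \dot{r}} = m\dot{r}, \quad \text{and} \quad p_\phi = \frac{\partial L}{\partial \dot{\phi}} = m r^2\dot{\phi}, \qquad (3.13)$$


which implies


$$H = \frac{p_r^2}{2m} + \frac{p_\phi^2}{2mr^2} - \frac{\alpha}{r}. \qquad (3.14)$$


The Hamiltonian is independent of $\phi$, which implies from Hamilton’s equations of
motion,


$$\dot{p} = -\frac{\partial H}{\partial q}, \quad \text{and} \quad \dot{q} = \frac{\partial H}{\partial p}, \qquad (3.15)$$


that $p_\phi$ is a conserved quantity:


$$\dot{p}_\phi = -\frac{\partial H}{\partial \phi} = 0 \;\Longrightarrow\; p_\phi = \text{const}. \qquad (3.16)$$


This of course is just a restatement of the conservation of angular momentum

$$\ell = p_\phi = m r^2\dot{\phi}. \qquad (3.17)$$


We can substitute Eq. (3.17) back into Eq. (3.14), which gives a one-dimensional
Hamiltonian as promised:


$$H = \frac{p_r^2}{2m} + \frac{\ell^2}{2mr^2} - \frac{\alpha}{r}. \qquad (3.18)$$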


The Hamiltonian is a constant of the motion (the energy of the system), and it is
sometimes useful to consider the dynamics of particle motion from this perspective,
where $E = H = T + V$ and $T$ and $V$ are the kinetic and potential energies
respectively. Here the potential for the one-dimensional motion is often called the
effective potential and is given by


$$V_{\rm eff}(r) = \frac{\ell^2}{2mr^2} - \frac{\alpha}{r}. \qquad (3.19)$$

It is this potential that governs the radial motion, with the first term pushing the
particle away from the origin and the second term attracting the particle to the origin.
The balance gives orbital motion between two turning points of zero radial kinetic
energy, the apogee and perigee.
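
As a consistency check between the two descriptions, the turning points of the effective potential can be matched to $r_{\min}$ and $r_{\max}$ of the orbital solution. A short sketch, setting $p_r = 0$ in $E = H$ and writing $u = 1/r$, $p = \ell^2/\alpha m$:

```latex
\begin{align*}
E &= \frac{\ell^2 u^2}{2m} - \alpha u
\;\Longrightarrow\;
u^2 - \frac{2}{p}\,u - \frac{2mE}{\ell^2} = 0, \\
u_{\pm} &= \frac{1}{p}\left(1 \pm \sqrt{1 + \frac{2E\ell^2}{m\alpha^2}}\right)
\equiv \frac{1 \pm e}{p},
\qquad e = \sqrt{1 + \frac{2E\ell^2}{m\alpha^2}} .
\end{align*}
```

Bound orbits have $E < 0$ and therefore $e < 1$, and the two roots reproduce $r_{\min} = p/(1+e)$ and $r_{\max} = p/(1-e)$.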


3.3 Perihelion Precessions from Perturbations 


From the previous section we know that the orbit from Newton’s simple $1/r^2$ force
law is


$$u(\phi) = \frac{1}{r(\phi)} = \frac{1}{p}(1 + e\cos\phi). \qquad (3.20)$$


This obviously does not allow any advancement of the perihelion. The minimum radius is
where $du/d\phi = 0$, which implies $\sin\phi = 0$, and therefore $\phi = 0, 2\pi, 4\pi, \ldots$ mark
the successive perihelions. The discovery of the anomalous perihelion precession of
Mercury, if it can be established, would signal the end of the Newtonian era and 
initiate the search for a better theory. As the reader is no doubt aware, perihelion 
precessions exist for every planet’s orbit (see Table 3.1), but for the present let us 
continue on our theoretical discussion. 


3.3.1 $1/r^2$ Correction to the Central Potential


Let us look at how the orbits change if we add a $1/r^2$ correction to the potential of
the gravitational interaction lagrangian. Let us call this Bob’s theory with lagrangian


$$L_{\rm bob} = \frac{1}{2} m \dot{\vec{r}}^{\,2} + \frac{\alpha}{r}\left(1 + \frac{R_{\rm bob}}{r}\right) \qquad (3.21)$$


where $\alpha = GMm$, with $G$ being Newton’s constant, $M$ is the mass of the sun, and $m$
is the mass of the planet under consideration. This new law requires the introduction
of a new fundamental length scale $R_{\rm bob}$, which is a priori unknown. However, we do
know, as will be shown below, that it leads to a perihelion precession of the orbits
governed by this law.

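Lagrange’s equations of motion for this theory are

$$\text{radial}: \quad m(\ddot{r} - r\dot{\phi}^2) = -\frac{\alpha}{r^2}\left(1 + \frac{2R_{\rm bob}}{r}\right) \qquad (3.22)$$
$$\text{angular}: \quad m(2\dot{r}\dot{\phi} + r\ddot{\phi}) = 0 \qquad (3.23)$$


The angular equation yields conservation of angular momentum $\ell = m r^2\dot{\phi} = \text{const}$
just as before. Using this, we can write the radial differential equation as


$$\frac{d^2 u}{d\phi^2} + u = \frac{\alpha m}{\ell^2}\left(1 + 2R_{\rm bob}\,u\right). \qquad (3.24)$$


This can be rewritten as


$$\frac{d^2 u}{d\phi^2} + \left(1 - \frac{2R_{\rm bob}}{p}\right)u = \frac{1}{p}, \quad \text{where } p = \frac{\ell^2}{\alpha m}. \qquad (3.25)$$


The general solution to this equation, assuming perihelion is placed at $\phi = 0$, is


$$u(\phi) = u_0\cos\left(\phi\sqrt{1 - \frac{2R_{\rm bob}}{p}}\right) + \frac{1}{p - 2R_{\rm bob}} \qquad (3.26)$$


or, written differently,


$$u(\phi) = \frac{1}{p - 2R_{\rm bob}}\left[1 + e\cos\left(\phi\sqrt{1 - \frac{2R_{\rm bob}}{p}}\right)\right] \qquad (3.27)$$


where $e = u_0(p - 2R_{\rm bob})$.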
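The $u(\phi)$ solution describes the motion of a precessing ellipse. The first perihelion
by definition is at $\phi = 0$ and the second perihelion occurs when


$$\phi\sqrt{1 - \frac{2R_{\rm bob}}{p}} = 2\pi \;\Longrightarrow\; \phi = \frac{2\pi}{\sqrt{1 - 2R_{\rm bob}/p}} \simeq 2\pi + \frac{2\pi R_{\rm bob}}{p}. \qquad (3.28)$$


The small perihelion advance is the deviation of $\phi$ from $2\pi$ and is $\delta = 2\pi R_{\rm bob}/p$.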
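
The advance can also be checked by integrating the orbit equation directly. The sketch below is an illustration under stated assumptions: it integrates $u'' = \mathrm{rhs}(u) - u$ with a fixed-step fourth-order Runge-Kutta scheme, starting from a perihelion, and locates the next perihelion by linear interpolation of the zero crossing of $du/d\phi$. The function name `perihelion_advance`, the step count, and the sample values ($p = 1$, $e = 0.2$, $R_{\rm bob} = 10^{-3}$ in units of $p$) are all hypothetical choices.

```python
import math

def perihelion_advance(rhs, u_peri, n_steps=20000):
    """Integrate u'' = rhs(u) - u from a perihelion (u = u_peri, du/dphi = 0
    at phi = 0) and return the angle of the next perihelion minus 2*pi."""
    h = 2.0 * math.pi / n_steps
    phi, u, w = 0.0, u_peri, 0.0            # w stands for du/dphi

    def rk4_step(u, w):
        # Classical RK4 for the first-order system (u' = w, w' = rhs(u) - u).
        k1u, k1w = w, rhs(u) - u
        u2, w2 = u + 0.5 * h * k1u, w + 0.5 * h * k1w
        k2u, k2w = w2, rhs(u2) - u2
        u3, w3 = u + 0.5 * h * k2u, w + 0.5 * h * k2w
        k3u, k3w = w3, rhs(u3) - u3
        u4, w4 = u + h * k3u, w + h * k3w
        k4u, k4w = w4, rhs(u4) - u4
        return (u + h * (k1u + 2.0 * k2u + 2.0 * k3u + k4u) / 6.0,
                w + h * (k1w + 2.0 * k2w + 2.0 * k3w + k4w) / 6.0)

    while phi < 4.0 * math.pi:
        u_new, w_new = rk4_step(u, w)
        # A perihelion is a maximum of u: du/dphi crosses zero from above.
        if w > 0.0 and w_new <= 0.0:
            phi_peri = phi + h * w / (w - w_new)   # linear interpolation
            return phi_peri - 2.0 * math.pi
        phi, u, w = phi + h, u_new, w_new
    raise RuntimeError("no second perihelion found within two revolutions")

# Bob's theory, Eq. (3.24), with illustrative values in units where p = 1.
p, e, r_bob = 1.0, 0.2, 1.0e-3
u_start = (1.0 + e) / (p - 2.0 * r_bob)            # perihelion value from Eq. (3.27)
delta = perihelion_advance(lambda u: (1.0 + 2.0 * r_bob * u) / p, u_start)
print("numerical advance:", delta, " first-order estimate:", 2.0 * math.pi * r_bob / p)
```

The integrator only needs the right-hand side of the orbit equation, so other central-force corrections could be tested in the same way by passing a different `rhs`.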

Given our previous computations, we are now able to evaluate the relationship
between the extra length scale $R_{\rm bob}$ and the perihelion advance of an orbit. In one
case, if we have made a measurement of the perihelion advance, we can derive what
value $R_{\rm bob}$ must be to reproduce that value


$$R_{\rm bob} = (1.16\ \mathrm{km})\left(\frac{\delta/T_{\rm orbit}}{\mathrm{arcsec}\cdot\mathrm{century}^{-1}}\right)\left(\frac{p}{1\ \mathrm{au}}\right)\left(\frac{T_{\rm orbit}}{1\ \mathrm{year}}\right) \qquad (3.29)$$


where $p$ is related to the common parameters of the semimajor axis $a = (r_{\min} +
r_{\max})/2$ and eccentricity $e$ by the relation $p = a(1 - e^2)$.

Table 3.1 Data for planetary orbits


Planet    T_orbit (years)  e      a (au)  p (au)  r_min (au)  r_max (au)
Mercury   0.241            0.206  0.387   0.371   0.307       0.467
Venus     0.615            0.007  0.723   0.723   0.718       0.728
Earth     1.000            0.017  1.000   1.000   0.983       1.017
Mars      1.881            0.093  1.524   1.511   1.382       1.666
Jupiter   11.86            0.048  5.203   5.191   4.953       5.453
Saturn    29.46            0.056  9.539   9.509   9.005       10.07
Uranus    84.02            0.047  19.19   19.15   18.29       20.09
Neptune   164.8            0.009  30.06   30.06   29.79       30.33
Pluto     247.7            0.249  39.46   37.01   29.63       49.29


$T_{\rm orbit}$ is the time for one full revolution in earth years, $e$ is the eccentricity of the orbit, $a$ is the
semimajor axis in astronomical units (1 au $= 1.496 \times 10^{11}$ m), $p = \ell^2/(GMm^2) = a(1 - e^2)$ is the
orbital latus rectum in astronomical units (and is independent of $m$ ultimately), $r_{\min} = a(1 - e)$
is the distance of perigee in astronomical units, and $r_{\max} = a(1 + e)$ is the distance of apogee in
astronomical units


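On the other hand, if we have a theory for what $R_{\rm bob}$ should be, we can make a
prediction for the perihelion advance in units of arc seconds per century:


$$\frac{\delta}{T_{\rm orbit}} = \frac{2\pi R_{\rm bob}}{p\,T_{\rm orbit}} = (0.866\ \mathrm{arcsec}\cdot\mathrm{century}^{-1})\left(\frac{1\ \mathrm{au}}{p}\right)\left(\frac{1\ \mathrm{year}}{T_{\rm orbit}}\right)\left(\frac{R_{\rm bob}}{1\ \mathrm{km}}\right) \qquad (3.30)$$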
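
To make the two directions of this relation concrete, the sketch below first infers $R_{\rm bob}$ from Mercury's 43 arcseconds per century and then uses that single length scale to predict the advance of other planets, with orbital data taken from Table 3.1. The function names and constants are assumptions introduced for this illustration; only $\delta = 2\pi R_{\rm bob}/p$ and the tabulated values come from the text.

```python
import math

AU_M = 1.496e11                              # metres per astronomical unit (Table 3.1)
ARCSEC_RAD = math.pi / (180.0 * 3600.0)      # radians per arcsecond

def r_bob_from_advance(rate_arcsec_per_century, p_au, t_orbit_yr):
    """Invert delta = 2*pi*R_bob/p; returns R_bob in metres."""
    # Advance per orbit in radians = rate per century * (orbit duration / century).
    delta = rate_arcsec_per_century * ARCSEC_RAD * t_orbit_yr / 100.0
    return delta * p_au * AU_M / (2.0 * math.pi)

def advance_from_r_bob(r_bob_m, p_au, t_orbit_yr):
    """Predicted perihelion advance in arcseconds per century for a given R_bob."""
    delta = 2.0 * math.pi * r_bob_m / (p_au * AU_M)     # radians per orbit
    return delta / ARCSEC_RAD * 100.0 / t_orbit_yr

# Fix R_bob from Mercury, then predict a few other planets.
r_bob = r_bob_from_advance(43.0, 0.371, 0.241)
print(f"R_bob inferred from Mercury: {r_bob / 1000.0:.2f} km")
for name, p_au, t_yr in [("Venus", 0.723, 0.615), ("Earth", 1.000, 1.000), ("Mars", 1.511, 1.881)]:
    print(f"{name}: {advance_from_r_bob(r_bob, p_au, t_yr):.2f} arcsec/century")
```

The unit prefactors of Eqs. (3.29) and (3.30) correspond to calling these helpers with unit arguments, which gives a quick way to check the conversion factors.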


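3.3.2 $1/r^3$ Correction to the Central Potential


Alice’s theory has a $1/r^3$ correction to the potential


$$L_{\rm alice} = \frac{1}{2} m \dot{\vec{r}}^{\,2} + \frac{\alpha}{r}\left(1 + \frac{R_{\rm alice}^2}{r^2}\right), \qquad (3.31)$$


which gives a $1/r^4$ correction to the gravitational force law. Lagrange’s equations
for her theory are


$$\text{radial}: \quad m(\ddot{r} - r\dot{\phi}^2) = -\frac{\alpha}{r^2}\left(1 + \frac{3R_{\rm alice}^2}{r^2}\right) \qquad (3.32)$$
$$\text{angular}: \quad m(2\dot{r}\dot{\phi} + r\ddot{\phi}) = 0 \qquad (3.33)$$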


Here again the angular momentum $\ell = m r^2\dot{\phi}$ is conserved from the angular equation,
and the radial equation becomes

$$\frac{d^2 u}{d\phi^2} + u = \frac{\alpha m}{\ell^2}\left(1 + 3R_{\rm alice}^2 u^2\right). \qquad (3.34)$$


We’ll solve this equation employing techniques of perturbation theory. We treat
the last term of Eq. (3.34) as a small perturbation and solve first the equation


$$\frac{d^2 u}{d\phi^2} + u = \frac{\alpha m}{\ell^2}, \qquad (3.35)$$

whose solution is just the standard Newtonian orbit solution

$$u_N(\phi) = \frac{1}{p}(1 + e\cos\phi), \quad \text{where } e = u_0 p \qquad (3.36)$$


where the subscript $N$ refers to the Newtonian solution, $u_0$ is an initial condition
constant and $p = \ell^2/\alpha m$ is the usual value.

The next step is to substitute $u \to u_N + \delta u$ into Eq. (3.34), where we only keep
one order in perturbation theory. Since the $u_N$ part of this expression cancels the usual
part of the differential equation from Newton’s law, we are left with a differential
equation for the perturbation $\delta u$:


$$\frac{d^2\delta u}{d\phi^2} + \delta u = \frac{3}{p}R_{\rm alice}^2\, u_N^2(\phi) = \frac{3R_{\rm alice}^2}{p^3}(1 + e\cos\phi)^2. \qquad (3.37)$$

To obtain the complete solution we need to solve for $\delta u$. The theory of ordinary
differential equations tells us that all we need is any particular solution, and here is
one:


$$\delta u = \frac{3R_{\rm alice}^2}{p^3}\left[1 + e\phi\sin\phi + \frac{e^2}{3}\cos 2\phi + e^2\sin^2\phi\right] \qquad (3.39)$$


The perihelia of the orbit can be obtained by solving for $\phi$ in


$$\frac{du}{d\phi} = -\frac{e}{p}\sin\phi + \frac{3R_{\rm alice}^2}{p^3}\left(e\sin\phi + e\phi\cos\phi - \frac{2e^2}{3}\sin 2\phi + 2e^2\sin\phi\cos\phi\right) = 0. \qquad (3.40)$$

The existence of the $\phi\cos\phi$ term in this equation, which came from the $\phi\sin\phi$ term
in $\delta u$, is causing the perihelion on the next cycle to shift away from $2\pi$. Defining
$\phi = 2\pi + \delta$ we can solve for $\delta$ in the perturbative expansion:


$$\delta = 6\pi\frac{R_{\rm alice}^2}{p^2}. \qquad (3.41)$$

Given our previous computations, we are now able to evaluate the relationship
between the extra length scale $R_{\rm alice}$ and the perihelion advance of an orbit. In one
case, if we have made a measurement of the perihelion advance, we can derive what
value $R_{\rm alice}$ must be to reproduce that value


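$$R_{\rm alice} = (7.58 \times 10^{6}\ \mathrm{m})\left(\frac{p}{\mathrm{au}}\right)\left(\frac{T_{\rm orbit}}{\mathrm{years}}\right)^{1/2}\left(\frac{\delta/T_{\rm orbit}}{\mathrm{arcsec/century}}\right)^{1/2} \qquad (3.42)$$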
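
As a small worked example of this inversion, the sketch below solves $\delta = 6\pi R_{\rm alice}^2/p^2$ for $R_{\rm alice}$ and evaluates it for Mercury, using the 43 arcseconds per century quoted earlier and the Table 3.1 values $p = 0.371$ au and $T_{\rm orbit} = 0.241$ years. The helper name and constants are assumptions made for illustration.

```python
import math

AU_M = 1.496e11                              # metres per astronomical unit (Table 3.1)
ARCSEC_RAD = math.pi / (180.0 * 3600.0)      # radians per arcsecond

def r_alice_from_advance(rate_arcsec_per_century, p_au, t_orbit_yr):
    """Invert delta = 6*pi*R_alice^2/p^2; returns R_alice in metres."""
    # Convert the rate per century into an advance per orbit, in radians.
    delta = rate_arcsec_per_century * ARCSEC_RAD * t_orbit_yr / 100.0
    return p_au * AU_M * math.sqrt(delta / (6.0 * math.pi))

r_alice = r_alice_from_advance(43.0, 0.371, 0.241)   # Mercury
print(f"R_alice inferred from Mercury: {r_alice:.3e} m")
```

Because $\delta$ scales as $R_{\rm alice}^2/p^2$ rather than $R_{\rm bob}/p$, the two theories predict different patterns of advance across the planets once the length scale is fixed by Mercury.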


On the other hand, if we have a theory for what $R_{\rm alice}$ should be, we can make a
prediction for the perihelion advance in units of arc seconds per century:


$$\frac{\delta}{T_{\rm orbit}} = \frac{6\pi R_{\rm alice}^2}{p^2\,T_{\rm orbit}} = \cdots$$


3.4 Philosophical Challenges to Newton’s Theory


We pause here to describe some foundational questions that Newton’s theory faced. 
There are three main philosophical problems: (1) What is the nature of absolute time 
and space, and is it necessary to invoke it? (2) Why should the gravitational mass 
be equal to the inertial mass in the equations of motion? And (3) how does nature 
enable action at a distance responses? 

Regarding Absolute Space and Time, Newton sets forth his ideas in the first 
Scholium of Principia. Almost immediately upon the publication of his book, 
Newton faced criticism from noted physicists and mathematicians. The most famous 
adversary regarding this was Leibniz, who claimed that the only thing that need be
talked about, and which ultimately defines space and time, is the relative motions of 
objects (relativism). Appeals to absolutes make no sense. Newton’s friend Samuel 
Clarke argued vociferously for the absolute viewpoint (substantivalism). Their cor- 
respondences are famous, and illuminating in the history of science. Over time these 
discussions progressed from what some might think is word quibbling to important 
physics principles emphasized by Mach and Einstein to name just two. Pedantic 
rigor of thinking can lead to the thought processes that generate significantly better 
theories, and this philosophical problem is arguably an illustration of that. 

The second problem, why the gravitational mass is equal to the inertial mass, is in
my mind the problem that should have kept everyone sleepless for those many
centuries when there was not an answer. Newton’s theory has nothing to say on
the matter, except “well, there it is.” These masses are two separate beasts, so why
should they be the same? The resolution of this issue is one of the core motivating
principles behind General Relativity, which succeeds in giving a deeper explanation 
for this curious equality. 

The third philosophical problem is sometimes called the problem of action at a 
distance. There are two aspects of action at a distance. The first is why two
bodies far removed from each other with nothing in between them should feel gravitational
attraction. Should not there be some “touching” or medium that carries the gravity
force from one body to the other? This action at a distance occurs between particles
separated by a large vacuum of nothing. This is hard to take. Even Newton was 
disturbed by it, especially the latter aspect. In 1693 he wrote his friend Richard 
Bentley “It is inconceivable that inanimate brute matter should, without the mediation 
of something else which is not material, operate upon and affect other matter without 
mutual contact ....” (Thayer 1953). 

The second aspect of the problem, which is related to the first, is how two bodies
far removed from each other in space can instantaneously feel the effect of one another’s
gravitational force. Newton’s theory implicitly assumes that all particles feel all other
particles’ gravitational attraction, with a strength set by the exact separations of those particles
at each moment of time. If a particle moves just a little, everybody knows about
it instantly and the resolution of forces is adjusted instantly. To Newton and others,
action at a distance was intolerable, but the Newtonian system was the best
thing going, and it had tremendous practical value, so it was not to be abandoned
despite its flaws.

The issue of instantaneity was noted from the start, and Laplace touched upon it 
in his highly influential Traité de Mécanique Céleste, published from 1797 to 1825. 
He stated that instantaneous propagation did not appear convincing,² and noted that
Bernoulli had suspicions as well. But Laplace knew that if the propagation speed were
indeed finite it would have to be extraordinarily fast, and even suggested, incorrectly
as it turns out, that some observations imply that it is eight million times that of light.
Laplace briefly brought up the possibility of modifying the inverse square law based
on this potential objection but ultimately dismissed it, stating that the simplicity of
Newton’s theory authorizes us to think of it as a rigorous law of nature.³

Nevertheless, the philosophical challenges to Newton’s theory are enough to
realize that it was not a complete theory. As we say often in physics today, there must
be “physics beyond the Standard Model”. How might signals of “new physics” show
up beyond Newton’s theory? Let us consider, for example, the disturbing underlying
assumption of action at a distance. As we implied above, there are two different
issues with action at a distance. There is the aspect of reaching across the mediumless
vacuum, and there is the aspect of instantaneous transmission of information to all
particles in the universe when one particle moves.

Transforming our theory from reaching across the vacuum action at a distance
to action by local contact is the subject of the theory of fields. Particles source
fields that permeate spacetime, and other particles experience those fields. Thus,
action at a distance is replaced by particle-field interactions in this classical point of
view. The emanating field propagates at finite velocity, which is incorporated
self-consistently into modern field theories, retaining causality and introducing the more
acceptable action by local contact.


² “La propagation instantanée qu’ils supposaient à cette force me parut peu vraisemblable” (Laplace
1805).

³ “En général, on verra dans le cours de cet ouvrage que la loi de la gravitation réciproque au carré
des distances représente avec une extrême précision toutes les inégalités observées des mouvements
célestes: cet accord, joint à la simplicité de cette loi, nous autorise à penser qu’elle est rigoureusement
celle de la nature” (Laplace 1805).

We do not need to fast forward all the way to the field theories of today to ask 
how Newton’s theory can be pressured experimentally by applying our philosophical 
worries. The most obvious way one should have thought to do it is by testing the 
instantaneous aspect of action at a distance. If one doubts that it is to be rigorously 
upheld, then we should expect that a quick movement of a body in a mechanical 
system might yield unexpected results since it might be significantly displaced from 
its original position by the time the other bodies “get word” of its flight, and it 
becomes ambiguous to know what direction and magnitude of force should be applied 
at all times. Thus, at some sufficiently high speed we might expect to see something 
unusual—something unplanned for in the Newtonian world. The trouble is, we do 
not know a priori at what speed this breakdown would occur, and we certainly do not
know what new description would be applicable. 

In circumstances like this, it is often best to write down effective theories that 
satisfy the symmetries of your worldview and do precision measurements to find 
deviations. The pattern of deviations or the values of couplings in the effective theory 
can lead to new insight when explained by a deeper theory. Bob’s $1/r^2$ correction
theory and Alice’s $1/r^3$ correction theory to the gravity potential in the preceding
sections do precisely that. They are Galilean invariant, and satisfy all the symmetries 
cherished even then: rotational invariance and translation invariance. 

We apply this approach of writing down corrections to planetary motion because 
this is our greatest hope to find cracks in the old classical world view. Since no 
cherished symmetries are violated by the additional terms we have found before, we 
may even expect to find breakdowns of Newton’s theory by the orbits of the planets, 
especially since they are accessible and moving faster with respect to each other and 
the sun than any laboratory system that could have possibly been created on the earth
at the time. Precision measurements of fast planetary motions thus had good reason 
to be the first place to find breakdown of Newton’s theory. No planet moves faster 
than Mercury. Indeed, it is Mercury where the first fissures arise, as we shall describe 
in the following sections. 


3.5 Effective Theories 


It is my contention that the concepts of Effective Theories, if understood and held 
by the early Newtonian scientists, would have led to a prediction that there must 
necessarily be an anomalous perihelion precession of Mercury and other planets, 
and that even the order of magnitude could have been guessed well before Le Ver- 
rier’s announcement in 1859. There was no barrier to adopting these ideas in New- 
ton’s day, as it requires no new special experimental knowledge, nor knowledge of 
Einstein’s relativity, but rather a more mature approach to how we think about the 
laws of nature. In order to present this viewpoint, I shall first give a précis of the 
modern notions of Effective Theories. 

At its core, the term Effective Theory is short for a body of evidence that has led us 
to understand that “everything depends on everything else” may be true in principle 
but certainly not true in practice. In a restricted domain, the theory manifests sym- 
metries and properties that provide the ability to calculate observables without the 
requirement of making reference to features outside the domain. A simple example 
of this is that we can compute the trajectory of a football to any practical precision 
without needing to know the location of Uranus. The effects of Uranus on the 
trajectory are suppressed by a relative factor of $m_U r_E^2/(m_E d_U^2) \sim 3\times 10^{-14}$, 
where $r_E$ is the radius of the earth, $d_U$ is the distance from Uranus, and $m_E$ ($m_U$) 
is the mass of the earth (Uranus). This is much too small to take into account for any 
practical need. The diminishing effect of Uranus as $d_U \to \infty$ is the principle of 
decoupling, which 
is at the core of Effective Theory utility and is the central reason why science works 
and we are able to compute and predict observables. 

A central concept of Effective Theory is the recognition that a full theory with 
heavy and light degrees of freedom can be written at low energies in terms of 
just light degrees of freedom after “integrating out” the heavy states or “coarse 
graining” over the small scales. We use “heavy” and “light” abstractly here, as it 
could refer to masses, momenta, velocity, etc. The chiral lagrangian of QCD, the 
Fermi theory of electroweak interactions, and the Landau-Ginzburg theory of 
superconductivity (Polchinski 1992) can all be recognized as Effective Theories of more 
fundamental theories. 

This top-down approach to understanding Effective Theories can give us a multi- 
tude of theoretical insights into the nature of simplified low-energy theories. It is this 
top-down approach that is traditionally how the power of Effective Theory concepts 
is demonstrated in particle physics (Cohen 1993; Rothstein 2003), fluid mechanics 
(Delgado-Bucalioni et al. 2005), material science (Abrams 2005), and essentially 
any other field that has a separation of scales. However, when considering theories 
from bottom up, the concepts we learn from Effective Theories can help us deduce 
modifications and additions to our present theories that can be tested by experiment. 
Success then can lead to motivations for inducing a more fundamental theory that 
reproduces the Effective Theory when restricted to its domain. It is this direction in 
theory analysis that I emphasize here for our present purposes. 

The insight that I would like to focus on, which I believe is the most powerful one 
when it comes to divining additions and modifications to theories, is the role that 
symmetries and naturalness play in the construction of the “complete” Effective The- 
ory. A symmetry is a recognition that something (a triangle, an equation, etc.) stays 
the same even if you make a closed set of transformations (i.e., group operations) on 
that object (rotations by 180°, interchange of x and y variables, etc.). All of our fun- 
damental theories have inherent recognized symmetries in them. We cannot proceed 
without these recognitions in the Effective Theory, because even the names we give 
to objects are merely shorthand notation for their symmetry properties (e.g., electrons 
are spin-1/2 representations of the Lorentz Group with additional gauge symmetry 
representation labels). 

One of the principal consequences of the Effective Theory approach to establishing 
natural law is that all possible interactions (or “terms”) consistent with the recognized 
symmetries of the Effective Theory are generically expected. There may or may not 
be additional terms that violate the symmetries, but terms that do not violate the 
symmetries must be included. In the realm of Effective Theories within quantum 
field theory, Weinberg, reflecting on the last three decades of work on the subject, 
has made the equivalent point that an Effective Theory may be considered self- 
consistent and not sick “as long as every term allowed by symmetries is included” 
(Weinberg 2009). 

In short, the precise form of a theory or law is not what is to be taken most 
seriously—it is the recognized symmetries. Upon sorting out the symmetries, the 
Effective Theory is to be developed with all possible terms consistent with the 
symmetry, and then qualitative expectations for experiment can be presented. What 
remains is measurement and pinning down the actual values of the coefficients of 
each symmetry-preserving interaction term. 


3.5.1 Application to Newton’s Gravitation 


Newton’s law of gravitation is that the force between two bodies of masses m and 
M is inversely proportional to the square of the distance between them, with the 
proportionality constant being Newton’s constant G: 


$$F(r) = \frac{GMm}{r^2} \quad\Longrightarrow\quad V(r) = -\frac{GMm}{r} \tag{3.44}$$

where $V(r)$ is the potential. In Book 3 of Principia, Newton states categorically that 
the inverse square law is “proved with the greatest exactness from the fact that the 
aphelia are at rest” and that “the slightest departure from the ratio of the square would 
necessarily result in a noticeable motion of the apsides....” (Newton 1999). Thus, the 
theory was created and solidified as a proposition to the world. 

Newton’s inverse-square law was so sacrosanct that few would ever doubt it. 
Immanuel Kant in 1747 used the inviolability of the inverse-square law to derive 
that space had three dimensions. This is due to what we would say today is the 
conservation of gravitational flux lines emanating from a point mass through the 
surface of a sphere of arbitrary radius. God could have chosen a different gravity law, 
Kant says, and the number of spatial dimensions then would have had to be different.⁴ 


4 “Zweitens, dass das Ganze, was daher entspringt, vermöge dieses Gesetzes [inverse-square law] 
die Eigenschaft der dreifachen Dimension habe; drittens, dass dieses Gesetz willkürlich sei, und da 
Gott dafür ein anderes, zum Exempel des umgekehrten dreifachen Verhältnisses [i.e., inverse-cube 
law], hätte wählen können; dass endlich viertens aus einem andern Gesetze auch eine Ausdehnung 
von andern Eigenschaften und Abmessungen geflossen wäre” (Sect. §10 in Kant 1747). 

This rigid adherence to “god-given” specific law is ultimately incorrect reasoning, 
and it is in conflict with modern views of Effective Theories. 

The modern sensibility says that we should focus more on the symmetries, and 
then refashion the complete Effective Theory using them. What are the symmetries of 
the Newtonian world? The symmetries are that the laws of physics cannot be affected 
by one’s orientation in space, by one’s location in space, nor by one’s location in 
time. The laws must be invariant to any transformation of rotation, spatial transla- 
tion, or time translation. These symmetry properties go under the name of Galilean 
invariance. As a side comment, the Lorentz invariance of Einstein’s special relativity 
asymptotes to Galilean invariance in the low velocity limit (i.e., when $v \ll c$). 

The interaction term of Eq. 3.44 is merely one term in an infinite number of terms 
that could be written down that are completely consistent with Galilean invariance. 
An Effective Theory approach would be to introduce them all and investigate the 
consequences. There is no meaningful symmetry that demands only the inverse 
square law interaction. Assured of this, one example would be to embellish Newton’s 
law by 


$$V(r) = -\frac{GMm}{r}\left[1 + \sum_{n=1}^{\infty} \lambda_n \left(\frac{r_0}{r}\right)^n\right] \tag{3.45}$$

where $r_0$ is some dimensionful Effective Theory length scale and $\lambda_n$ are 
dimensionless coefficients, which together with $r_0$ can be found by performing precise 
experiments. We should note that there are an infinite variety of other terms that 
could be added, including $\dot{r}^n$ and $\ddot{r}^n$ interactions, but we streamline the argument by 
looking only at one class of corrections that decouple as $r \to \infty$. 


3.5.2 Inevitable Perihelion Precession 


An extremely important conclusion can already be presented from the rules of Effec- 
tive Theories. Any deviation from the pure inverse square law will lead to a perihelion 
precession of the planets, and as the constructed Effective Theory demands additions 
to the inverse square law there will be an anomalous perihelion precession of the 
planets. On the other hand, we know that the inverse square law is approximately 
correct and thus we have added terms that decouple as $r \gg r_0$. The perihelion 
precession of Mercury is very small, and so we expect that $r_0$ should be much less than 
the orbital radius of Mercury around the sun. In that case, we are justified in looking 
at the first-order corrected potential, which we can write as ($\lambda_1 r_0 \to R$): 


$$V_1(r) = -\frac{GMm}{r}\left(1 + \frac{R}{r}\right). \tag{3.46}$$

By these arguments of Effective Theory, an anomalous perihelion precession of 
Mercury is inevitable. It is only a question of what value $R$ takes, which then sets 
the numerical value of the precession. In the subsequent sections we discuss some 
arguments for what R might be, from the vantage point of pre-special relativity and 
pre-general relativity days, and make rough quantitative predictions for the preces- 
sion rate. 

Up to this point we have argued that the focus should have been more on the 
symmetries of the gravitational theory rather than the concretization of the theory. 
A more complete Effective Theory for Newtonian gravity would have been accepted 
and one would have fully expected anomalous perihelion precessions of the planets. 
A potential similar in form to Eq. 3.46 would have been put forward, and the task of 
theoretically divining or experimentally measuring R would have been the consuming 
activity. 
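
A minimal sketch of why a potential of the form of Eq. 3.46 makes precession unavoidable, assuming the standard planar orbit equation in the variable $u = 1/r$ with conserved angular momentum $\ell$ and the shorthand $p = \ell^2/(GMm^2)$ for the semi-latus rectum of the uncorrected orbit (the symbols $u$, $\kappa$, $e$ and $\delta$ are notation introduced here for illustration):

$$\frac{d^2u}{d\phi^2} + u = -\frac{m}{\ell^2}\frac{dV_1}{du} = \frac{1}{p} + \frac{2R}{p}\,u \quad\Longrightarrow\quad \frac{d^2u}{d\phi^2} + \kappa^2 u = \frac{1}{p}, \qquad \kappa^2 = 1 - \frac{2R}{p},$$

$$u(\phi) = \frac{1 + e\cos(\kappa\phi)}{\kappa^2 p}, \qquad \delta = 2\pi\left(\frac{1}{\kappa} - 1\right) \simeq \frac{2\pi R}{p} \quad (R \ll p).$$

Under these assumptions the perihelion recurs after the angle advances by $2\pi/\kappa$ rather than $2\pi$, so any nonzero $R$ produces a precession $\delta$ per orbit, and a small observed precession corresponds to $R \ll p$.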


3.6 Mercury’s Anomalous Perihelion Precession 


Let us imagine that Bob and Alice are two physicists who are working in the 
post-Le Verrier, pre-Einstein era. They are smitten by the Newtonian worldview. 
They do not wish to do radical things to explain this perihelion precession. They are 
well-versed in the concepts of Galilean Invariance and Hamilton’s Principle, and have 
an inkling of the ideas of effective theories. Naturally, they want to describe this 
precession through a Galilean invariant effective theory of gravity. Bob announces 
that he wishes to add a $1/r^2$ correction to the lagrangian. Not wanting to follow in 
Bob’s footsteps, Alice declares that the force law should be even powers of $1/r$ and 
so her first correction to the lagrangian is $1/r^3$. The two lagrangians are 


$$L_{\rm bob} = \frac{1}{2}m\dot{\mathbf{r}}^2 + \frac{\alpha}{r}\left(1 + \frac{R_{\rm bob}}{r}\right) \tag{3.47}$$

$$L_{\rm alice} = \frac{1}{2}m\dot{\mathbf{r}}^2 + \frac{\alpha}{r}\left(1 + \frac{R_{\rm alice}^2}{r^2}\right) \tag{3.48}$$

where $\alpha = GMm$, with $G$ being Newton’s constant, $M$ the mass of the sun, and 
$m$ the mass of the planet under consideration. These are the two lagrangians of 
Bob and Alice that we studied in a previous lecture. These new laws of Bob and 
Alice require the introduction of a new fundamental length scale $R_i$. They do not 
know what that length scale is, but they have hopes that the new data will pin it down 
for them. 

Before we look more closely at Bob and Alice’s theories, we should remark again 
that in the classical history of gravity, there were early attempts to explain anomalies 
by changing Newton’s laws, even in the manner of Alice and Bob. Such theories 
go under the name of “Clairaut laws”. Clairaut proposed in 1745 that Newton’s law 
should be corrected by a $1/r^4$ force term in order to explain some thought-to-be 
anomalies in the movement of the lunar perigee. However, he found in the end there 
was not a discrepancy, which buried such laws deeper into the dustbin of history. 

Table 3.2 Anomalous perihelion precession rates of the planets compared to expectations from 
Newton’s law of gravity and taking into account all other sources of precession (effects of other 
planets’ orbits, etc.) (Duncombe 1956) 

Planet      $\delta/T$ (arcsec/century) 
Mercury     43.11 ± 0.45 
Venus       8.4 ± 4.8 
Earth       5.0 ± 1.2 

More modern references test gravity (including precession rates) through the parameters of the 
so-called parametrized post-Newtonian (PPN) approach (Will 2005) 


Newcomb commented in 1882 that such laws were “out of the question” because 
they disrupted the gravitational strength so wildly at very close distances where 
the correction term would come to dominate (Newcomb 1882). As late as 1910 
Newcomb, the world’s leader on this issue, was stating that all the data up to that 
point “... seems to preclude the possibility of any deviation from that law [Newton’s 
inverse-square law]” and that Mercury’s perihelion advance is best explained by 
“the hypothesis of Seeliger” (Newcomb 1910), which was a zodiacal light theory 
that contained intra-Mercurial distributions of orbital matter minimally disruptive to 
all other astronomical observations except Mercury’s perihelion advance (see, e.g., 
Chap. 4 of Roseveare 1982). 

Bob and Alice’s theories are a return to the Clairaut law in some ways. In the 
next few subsections we merely state the effect they would have on planetary orbits. 
After a discussion of Effective Theories and how they apply to this problem, we 
shall proceed with a somewhat fanciful alternative history of how deviations from 
Newton’s laws could have been explained and interpreted from the point of view 
of Effective Theories after the anomaly was announced by Le Verrier. But it should 
be kept in mind, and will be emphasized again in the concluding section, that these 
theories could have been anticipated, and perhaps even should have been anticipated, 
before Le Verrier’s announcement. 


3.6.1 Analyzing Bob’s $1/r^2$ Correction Theory 


From Eq. 3.30 we can compute in Bob’s theory that it is necessary that $R_{\rm bob} = 
4.4$ km if Mercury is to have the measured 43” of arc per century in its perihelion 
precession. Given this value of $R_{\rm bob}$, Bob can make predictions for the perihelion 
advance of other planets. Using Eq. 3.28 he finds $\delta/T_{\rm orbit}$ = 8.6” of arc per century 
for Venus’s perihelion precession and 3.8” for the earth. These predicted values 
compare favorably to the measurements for Venus and Earth presented in Table 3.2. 
The predictions are well within the errors, and Bob is pleased because he has found a 
way to explain the anomaly while yet retaining Galilean invariance as a fundamental 
symmetry of spacetime. He has done this through the means of a simple expansion 
correction to Newton’s law of gravity. Nothing radical was done. 
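
Bob's fit and predictions can be reproduced with a short numerical sketch. It assumes that the first-order precession per orbit for the $1/r^2$ correction is $2\pi R/p$, with $p = a(1 - e^2)$ the semi-latus rectum (taken here to be the content of Eqs. 3.28 and 3.30, which are not reproduced in this section), and it uses rounded textbook orbital elements that are not taken from the text. The function and variable names are illustrative only.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)   # radians per arcsecond
DAYS_PER_CENTURY = 36525.0

# Rounded textbook orbital elements (assumed inputs, not from the text):
# semi-major axis [m], eccentricity, orbital period [days]
ORBITS = {
    "Mercury": (5.7909e10, 0.2056, 87.969),
    "Venus": (1.0821e11, 0.0068, 224.701),
    "Earth": (1.49598e11, 0.0167, 365.256),
}


def semi_latus_rectum(a, e):
    """p = a (1 - e^2) for a Kepler ellipse."""
    return a * (1.0 - e * e)


def bob_rate(R, planet):
    """Precession in arcsec/century for the 1/r^2 correction (first order in R/p)."""
    a, e, period = ORBITS[planet]
    per_orbit = 2.0 * math.pi * R / semi_latus_rectum(a, e)   # radians per orbit
    return per_orbit * (DAYS_PER_CENTURY / period) / ARCSEC


def fit_R_bob(rate, planet="Mercury"):
    """Invert bob_rate: the R (in metres) that yields `rate` arcsec/century."""
    a, e, period = ORBITS[planet]
    per_orbit = rate * ARCSEC * period / DAYS_PER_CENTURY
    return per_orbit * semi_latus_rectum(a, e) / (2.0 * math.pi)


R_bob = fit_R_bob(43.0)                 # fit to Mercury's 43 arcsec/century
venus_rate = bob_rate(R_bob, "Venus")   # prediction with the same R_bob
earth_rate = bob_rate(R_bob, "Earth")
print(f"R_bob = {R_bob / 1e3:.2f} km")
print(f"Venus: {venus_rate:.2f}, Earth: {earth_rate:.2f} arcsec/century")
```

With these inputs the sketch gives a length close to 4.4 km and rates close to 8.6” and 3.8” per century, in line with the values quoted above.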

Despite the successes, Bob is not totally satisfied. He wants to know if he can 
argue for this new length constant in nature, $R_{\rm bob}$. It is a very strange distance, 4.4 km. 
He wonders how he can formulate this distance from all the invariants swirling around 
him. It should not depend on the mass of each planet, he reasons, because we have 
just shown that one value of $R_{\rm bob}$ appears to work universally well for all planets. 
The other options we have to build a length scale are from Newton’s constant $G$, the 
mass of the Sun $M$ and angular momentum. Bob fails to find any natural combination 
that will give 4.4 km. 

Before giving up he recalls that his intuition has told him that there is some 
characteristic high speed such that Newton’s simple laws become strained (see 
Sect.3.4). He does not know what that speed value is, and his new law is just 
as much action at a distance as the old one, but he carries on by giving this new 
speed a name, $v_{\rm bob}$. With this new undetermined speed in hand he realizes 
immediately that he can form a new length scale $GM/v_{\rm bob}^2$. Can this be the origin of 
$R_{\rm bob}$? What value must $v_{\rm bob}$ be to recover $R_{\rm bob} = 4.4$ km? A simple calculation 
yields 

$$v_{\rm bob} = \sqrt{\frac{GM}{R_{\rm bob}}} = 1.7\times 10^{8}\ {\rm m/s}. \tag{3.49}$$


This quantity $v_{\rm bob}$ that Bob has derived is a very curious number! His colleagues 
down the hall have been working on the theory of electromagnetic phenomena and 
a speed very close to that keeps showing up in their equations, $c = 3.0\times 10^{8}$ m/s. 
This is the propagation speed of light. He decides this cannot be a coincidence, but 
he is not sure what to make of it. He decides to define a new scale based on these 
thoughts, the “sun’s electro-gravity scale” $R_{EG} = GM/c^2$. $R_{\rm bob}$ can now be written 
in terms of this definite scale, $R_{\rm bob} = \lambda_{\rm bob} R_{EG}$. It is very curious that the data fit 
very well if $\lambda_{\rm bob} = 3$, an integer. He writes on a piece of paper his new theory of 
gravity 

$$L_{\rm bob} = \frac{1}{2}m\dot{\mathbf{r}}^2 + \frac{GMm}{r}\left(1 + 3\,\frac{GM/c^2}{r}\right), \tag{3.50}$$


and he is pleased with its simplicity, elegance and symmetry. He does not know how 
the speed of light c crept in, but he is satisfied since his lagrangian looks “natural” 
given that there are no really big or really small numbers populating it. Furthermore, 
he knows that if he must construct a new length scale with a speed, the “natural” next 
known threshold of speed is the speed of light, and so this correction is “natural” to 
explore. He feels he is on to something big. 
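
The arithmetic behind Bob's speed and his integer coefficient is quick to check. The sketch below assumes standard values for the Sun's gravitational parameter $GM$ and for $c$ (neither is quoted to this precision in the text) and takes $R_{\rm bob} = 4.4$ km as given; the variable names are illustrative.

```python
import math

GM_SUN = 1.32712e20        # m^3/s^2, standard value (assumed input)
C_LIGHT = 2.99792458e8     # m/s
R_BOB = 4.4e3              # m, Bob's fitted length scale from the text

# Speed that turns GM/v^2 into Bob's length scale
v_bob = math.sqrt(GM_SUN / R_BOB)

# The "electro-gravity scale" GM/c^2 and the dimensionless ratio R_bob / R_EG
R_EG = GM_SUN / C_LIGHT**2
lambda_bob = R_BOB / R_EG

print(f"v_bob      = {v_bob:.2e} m/s")
print(f"R_EG       = {R_EG:.0f} m")
print(f"lambda_bob = {lambda_bob:.2f}")
```

Under these assumptions $v_{\rm bob}$ comes out at the same order of magnitude as $c$, and the ratio $R_{\rm bob}/R_{EG}$ lands within about one percent of 3.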

Bob finds another interesting connection with this scale $GM/c^2$. He recognizes 
that there is a small radius $R_g$ such that, for a compact spherical body of mass $M$ 
(i.e., one with radius less than $R_g$), an object going the speed of light would not be able to 
escape. This light-speed trapping radius is a curiosity: if light were corpuscular in 
any sense, as Newton and others thought it might be, then we could see no light 
emanating from within the radius $R_g$ of the massive body. This sets a mystery scale 
to gravity that requires further scrutiny and may be a length scale associated with 
changes in gravity. The computation of this scale is simple in the Newtonian world, 
and is 

$$R_g = \frac{2GM}{c^2} \quad \text{(light non-escape radius)} \tag{3.51}$$

This is within a factor of two of the value of $R$ he has derived from the 
perihelion precession rate. It should be noted that $R_g$ is precisely the 
Schwarzschild radius derived in General Relativity, which is a well-known special scale for 
spherically symmetric objects for more reasons than just what was stated above 
(Schwarzschild 1916; Wald 1984). Furthermore, it should be recalled that the speed 
of light was being quantitatively estimated (Rømer 1676) even before Newton’s 
Principia, and by 1729 it was known to within a few percent (Bradley 1729), 
and so this scale had precise meaning from the very beginning days of Newtonian 
gravity. 
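
The Newtonian computation of the light non-escape radius can be sketched in one line, assuming a test body of mass $m$ that leaves the radius $R_g$ with speed $c$ and only just reaches infinity with zero kinetic energy:

$$\frac{1}{2}mc^2 - \frac{GMm}{R_g} = 0 \quad\Longrightarrow\quad R_g = \frac{2GM}{c^2} \approx 3\ {\rm km\ for\ the\ sun},$$

where the numerical value assumes $GM/c^2 \approx 1.5$ km for the sun, the same scale that gives $3GM/c^2 \approx 4.4$ km.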

Despite these interesting connections, Bob gets nervous looking over his 
equations. Equation 3.27 seems to indicate that if $p < 2R_{\rm bob} = 6GM/c^2$, the orbits do 
not make sense anymore, as the equations formally say $r < 0$, which is nonsensical. 
He relaxes briefly when he realizes that $2R_{\rm bob}$ is only 9 km, which is well below the 
orbital radius of any planet, and furthermore it is even below the radius of the sun, 
which is $7\times 10^{5}$ km. Thus, any small object orbiting the sun closely enough to fall 
outside the reach of Bob’s theory would have to be 
inside the sun. 

Nevertheless, he is still a bit uncomfortable. Nowhere in his derivation was the 
radius of the sun ever required. In principle, all that mass of the sun could have 
been at one infinitesimal point for all the equations knew. Never mind how to pack 
all that mass within a radius less than 9 km; it is a possibility in principle that 
such a tightly packed object exists, and if it did, there is no way his theory could 
describe close-by orbits with characteristic orbital latus rectum size $p < 9$ km. He 
knows his theory cannot be the be-all and end-all of theories anyway, since he does not know 
why $c$ crept into his equations, despite that being the natural next “speed scale” to 
consider, but now he is even more discomfited because he can imagine configurations 
where his theory cannot even give an answer. But that is for another day. He has 
succeeded in explaining the precessions of Mercury, Venus and Earth and that is 
enough for a day’s work. And that is what Effective Theories do. They explain the 
day’s work (Bob clearly has made progress), but there is more to be learned and 
understood. Effective Theory practitioners understand that all possible questions 
cannot be resolved instantly, and that there are necessarily deeper effective theories 
to come. 


3.6.2 Analyzing Alice’s $1/r^3$ Correction Theory 


Alice now wishes to make definite her lagrangian with $1/r^3$ potential corrections by 
specifying the value of $R_{\rm alice}$ from Mercury’s anomalous perihelion precession and 
then predicting what the other precession rates are. Upon fitting Mercury data she 
finds $R_{\rm alice} = 9.04\times 10^{6}$ m. Using that fixed value for all planets she then predicts 
$\delta/T_{\rm orbit}$ = 4.43” of arc per century for Venus and 1.4” of arc per century for the 
Earth. The Venus result is nearly $1\sigma$ off compared to the measurement, and the Earth 
result is about $3\sigma$ off of the measurement (see Table 3.2). Alice has a choice now. 
She can say her theory predicts that further refined measurements of the precession 
rates will yield smaller central values of the precession rates for Venus and Earth in 
concert with her theory. Or, she can take the $3\sigma$ discrepancy seriously and attempt 
to modify her theory. 
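
Alice's fit and the size of the resulting tension can be sketched numerically. The example assumes the standard first-order result that a $1/r^3$ potential correction of strength $R^2$ gives a precession per orbit of $6\pi R^2/p^2$, with $p = a(1 - e^2)$ (this formula is an assumption of the sketch, not quoted from the text), and it uses rounded textbook orbital elements together with the measured values of Table 3.2. All names are illustrative.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)   # radians per arcsecond
DAYS_PER_CENTURY = 36525.0

# Rounded textbook orbital elements (assumed inputs, not from the text):
# semi-major axis [m], eccentricity, orbital period [days]
ORBITS = {
    "Mercury": (5.7909e10, 0.2056, 87.969),
    "Venus": (1.0821e11, 0.0068, 224.701),
    "Earth": (1.49598e11, 0.0167, 365.256),
}

# Measured anomalous rates and uncertainties from Table 3.2 (arcsec/century)
MEASURED = {"Venus": (8.4, 4.8), "Earth": (5.0, 1.2)}


def semi_latus_rectum(a, e):
    return a * (1.0 - e * e)


def alice_rate(R, planet):
    """Precession in arcsec/century for the 1/r^3 correction (first order)."""
    a, e, period = ORBITS[planet]
    p = semi_latus_rectum(a, e)
    per_orbit = 6.0 * math.pi * R**2 / p**2          # radians per orbit
    return per_orbit * (DAYS_PER_CENTURY / period) / ARCSEC


def fit_R_alice(rate, planet="Mercury"):
    """Invert alice_rate: the R (in metres) that yields `rate` arcsec/century."""
    a, e, period = ORBITS[planet]
    per_orbit = rate * ARCSEC * period / DAYS_PER_CENTURY
    return semi_latus_rectum(a, e) * math.sqrt(per_orbit / (6.0 * math.pi))


R_alice = fit_R_alice(43.0)            # fit to Mercury's 43 arcsec/century
predicted = {name: alice_rate(R_alice, name) for name in MEASURED}
# Deviation of each prediction from the measurement, in units of the quoted error
pulls = {name: (MEASURED[name][0] - predicted[name]) / MEASURED[name][1]
         for name in MEASURED}

print(f"R_alice = {R_alice:.3e} m")
for name in MEASURED:
    print(f"{name}: predicted {predicted[name]:.2f}, pull {pulls[name]:.2f} sigma")
```

With a single fixed length the predictions fall below both measured central values, by somewhat less than one quoted uncertainty for Venus and by about three for the Earth.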

Alice makes the right choice and seeks to modify the theory. She computes what 
Ralice needs to be for each planetary case to precisely hit the measured values. She 
finds 


$$R^{M}_{\rm alice} = 9.0\times 10^{6}\ {\rm m}, \quad R^{V}_{\rm alice} = 1.3\times 10^{7}\ {\rm m}, \quad R^{E}_{\rm alice} = 1.5\times 10^{7}\ {\rm m}. \tag{3.52}$$

Similar to Bob, she begins to think about how these length scales can be identified 
with all the quantities that she has available to her in the problem: $M$, $m_{\rm planet}$, and $\ell$. 
She cannot come to a satisfactory answer. These constants alone are not enough to 
form the length scales of Eq. 3.52. 

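However, in Alice’s trials she notices something interesting. The $R_{\rm alice}$ lengths are 
proportional to angular momentum divided by mass of the planet, $R^{i}_{\rm alice} \propto \ell_i/m_i$, 
with the same proportionality constant. This constant has the dimensions of an inverse 
velocity. She decides to call it $1/v_{\rm alice}$ and solves for the value of $v_{\rm alice}$: 

$$R^{i}_{\rm alice} = \frac{1}{v_{\rm alice}}\frac{\ell_i}{m_i} \quad\Longrightarrow\quad v_{\rm alice} = \frac{\ell_i}{m_i R^{i}_{\rm alice}} = 3.0\times 10^{8}\ {\rm m/s} \tag{3.53}$$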

Alice also has colleagues that work on electromagnetism and she recognizes this 
value as exactly the speed of light, valice = c. How did that happen? She does not 
know, but she is surely excited about the result, as she too recognizes that $c$ is the next 
fundamental “speed threshold” and so is a “natural” value in the Effective Theory 
correction. She has explained all the planetary precession data. She writes down on 
a piece of paper her new theory of gravity, 


$$L_{\rm alice} = \frac{1}{2}m\dot{\mathbf{r}}^2 + \frac{GMm}{r}\left(1 + \frac{1}{c^2}\frac{\ell^2/m^2}{r^2}\right), \tag{3.54}$$


which like Bob’s theory possesses symmetry and has a measure of elegance and 
simplicity. 
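
The proportionality Alice notices can be checked directly. The sketch below assumes the test-body Kepler relation $\ell/m = \sqrt{GMp}$ with $p = a(1 - e^2)$, standard values of the Sun's $GM$ and of $c$, and rounded textbook orbital elements (none of these inputs are quoted in the text); it evaluates $\ell_i/(m_i c)$ for the three planets.

```python
import math

GM_SUN = 1.32712e20        # m^3/s^2, standard value (assumed input)
C_LIGHT = 2.99792458e8     # m/s

# Rounded textbook orbital elements (assumed inputs): semi-major axis [m], eccentricity
ORBITS = {
    "Mercury": (5.7909e10, 0.2056),
    "Venus": (1.0821e11, 0.0068),
    "Earth": (1.49598e11, 0.0167),
}


def specific_angular_momentum(a, e):
    """l/m = sqrt(G M p) for a light body on a Kepler ellipse, p = a (1 - e^2)."""
    return math.sqrt(GM_SUN * a * (1.0 - e * e))


# Length scale l_i / (m_i c) for each planet, i.e. R_alice^i if v_alice = c
R_from_c = {name: specific_angular_momentum(a, e) / C_LIGHT
            for name, (a, e) in ORBITS.items()}

for name, R in R_from_c.items():
    print(f"{name}: l/(m c) = {R:.2e} m")
```

Under these assumptions the three lengths agree with the rounded values of Eq. 3.52, which is the sense in which a single speed, $c$, organizes all three planets.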

As she reflects on her theory she realizes that since angular momentum is $\ell \sim mrv$, 
where $v$ is the velocity of the planet orbiting the sun, the second term inside the 
parentheses can be thought of as an $m$-independent $v^2/c^2$ correction to the Newtonian 
gravitational potential. Thus, she believes that she will be the first to show that the 
simple inverse-square law of Newton is corrected by factors of $v^2/c^2$. As the speed 
of the planet gets closer to the speed of light, Newton’s theory begins to crack. So 
far the basic assumptions of spacetime symmetries—Galilean Invariance—are not 
breaking down, just the simple form of Newton’s theory of gravity. Despite these 
successes of her theory, she remains slightly dissatisfied with one aspect. How can 
she convince herself, much less others, that her theory is better than Bob’s? Surely 
one or the other or some combination of these corrections is required by nature, 
she reasons, but can they be determined from deeper theory principles? The answer 
is yes, and Einstein’s General Relativity is that theory. 
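
A short consistency check shows why precession data alone cannot separate the two theories. It assumes the first-order precession-per-orbit results $\delta_{\rm bob} = 2\pi R_{\rm bob}/p$ and $\delta_{\rm alice} = 6\pi R_{\rm alice}^2/p^2$ for $1/r^2$ and $1/r^3$ potential corrections (standard perturbative formulas, quoted here as assumptions), with $p$ the semi-latus rectum and $\ell^2 = GMm^2 p$ for the unperturbed orbit:

$$\frac{\ell^2/m^2}{c^2 r^2} = \frac{v^2}{c^2} \ \ (\ell = mrv), \qquad \delta_{\rm bob} = \frac{2\pi}{p}\,\frac{3GM}{c^2} = \frac{6\pi GM}{c^2 p}, \qquad \delta_{\rm alice} = \frac{6\pi}{p^2}\,\frac{\ell^2}{m^2 c^2} = \frac{6\pi GM}{c^2 p}.$$

On these assumptions the two fitted theories give the same precession per orbit for every planet, so a deeper principle is needed to choose between them.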


3.6.3 Gerber’s “Utterly Worthless” Theory 


Before going to Einstein’s General Relativity, let us comment briefly on velocity 
dependent approaches to augmenting Newton’s law. Manipulations of the Newtonian 
potential were initiated in earnest well after Laplace’s work with the goal of rigorously 
incorporating finite speed effects of gravity. The most straightforward approaches 
failed. However, Paul Gerber proposed in 1898 (Gerber 1898) a velocity dependent 
potential correction that correctly accounted for Mercury’s perihelion precession: 


$$V(r, v) = -\frac{GMm}{r}\left(1 - \frac{v}{c}\right)^{-2} \tag{3.55}$$


where $c = 3\times 10^{8}$ m/s is the speed of light, and $v$ is the velocity of Mercury in the 
Sun-Mercury center of mass system. 

Gerber’s theory captured the attention of many due to its combined simplicity and 
effectiveness in accommodating Mercury’s anomalous perihelion precession rate. 
For example, Mach wrote, “Only Paul Gerber [reference to 1898 paper] studying the 
motion of Mercury’s perihelion ... did find that the speed of propagation of gravitation 
is the same as the speed of light” (Mach 1901). Gerber was attacked for not giving good 
reasons for his theory (a topic we shall take up below), but he did provide a simple 
theory that worked. It was also a “natural” theory due to its utilization of c as the 
next fundamental speed scale of the theory. 

Seventeen years after Gerber’s potential, the question of Mercury’s perihelion 
precession was resolved powerfully by Einstein’s GR (Wald 1984). At low velocities 
the first-order correction to gravitational attraction of Gerber’s theory matches the 
first-order correction of Einstein’s theory. However, Einstein’s approach had coher- 
ent principles and unassailable logic, and thoughts about Gerber’s theory quickly 
faded away. 

Despite the success in accommodating Mercury’s perihelion precession, Gerber 
was roundly criticized for his theory. The strength of the reaction that Gerber faced 
seems harsh for somebody who actually did write down a simple theory with no free 
parameters, with the speed of light in it, that worked. It is as though the deep thinkers 
at the time knew there was something appealing about Gerber’s work, but could 
not quite put their finger on it, and so harshly criticized it as a community-building 
exercise to dismiss that kind of apparently principle-less approach to physics. 

Einstein, commenting on Gerber’s theory well after he had developed his own 
theory of General Relativity, summarized the attitudes well: “But specialists in the 
field agree not only that Gerber’s derivation is thoroughly incorrect, but that the 
formula cannot even be obtained as a consequence of Gerber’s leading assumptions. 
Mr. Gerber’s paper is therefore utterly worthless” (Capria 1999) (italics are mine). 
This appears to be an overly strong dismissal of Gerber’s simple theory that gained 
so much attention. 

Pauli, in his famous Encyclopedia article on Relativity, said, 


Recently, an earlier attempt by P. Gerber has been discussed which tries to explain the peri- 
helion advance of Mercury with the help of the finite velocity of propagation of gravitation, 
but which must be considered completely unsuccessful from a theoretical point of view. For 
while it leads admittedly to the correct formula—though on the basis of false deductions—it 
must be stressed that, even so, only the numerical factor was new. (Pauli 1981) (italics mine) 


Whatever can be said of Gerber and his theory and the faulty logic behind his 
theory, it was not “utterly worthless” or “completely unsuccessful”. I believe it was a 
crude attempt at effective theory analysis. It was something he may have intuited but 
was unsuccessful in articulating well due to the mindset and style of physics of the 
day. Back then, no term was allowed to augment a theory without it being derived 
first from a deeper principle. The standard rigor of the day was that laws were exact 
by argument and deduction, and any deviations or changes must be accounted for by 
a new replacement principle. 

An excellent example of this prevailing attitude is provided by Max Born in his 
book on Einstein’s theory of relativity (Born 1924). He describes briefly the case of 
Mercury’s anomalous perihelion precession and then goes on to harangue all those 
people before Einstein who generated ad hoc solutions to the problem: 


Changes in the laws [Newton’s laws] have been proposed, but they have been invented 
quite arbitrarily and can be tested by no other facts, and their correctness is not proved 
by accounting for the motion of Mercury’s perihelion. If Newton’s theory really requires a 
refinement we must demand that it emanate, without the introduction of arbitrary constants, 
from a principle that is superior to the existing doctrine in generality and intrinsic probability. 
Einstein was the first to succeed in doing this. 


This attitude is partially in conflict with our understanding of Effective Theories 
today. The introduction of arbitrary constants is a key step in the construction of 
Effective Theories, and the role of experiment is to pin those down. If anything, the 
ad hoc inventors of changes in Newton’s law were too sheepish about introducing 
arbitrary parameters, and instead got tangled up with incoherent “deep reasons” for 
their particular laws. 

Effective Theory is an intermediate step between an old regime (e.g., Newton’s 
laws) and a new regime (e.g., Einstein’s General Relativity), and this intermediate 
step necessarily has “arbitrary couplings” and does not “emanate from a principle 
that is superior to the existing doctrine”. Instead, it says that the existing doctrine 
should be taken with the utmost seriousness (e.g., Galilean invariance) and data should 
be used to fit the parameters of all allowed interactions, and perhaps a deeper new theory can 
come along later to explain the relations among those parameters. 

Although Gerber’s theory was not worthless, it is not as valuable as Einstein’s 
General Relativity. Alice and Bob’s effective theories would not have been worthless 
either had they written them down much earlier. They would have been an intermediate 
stepping stone from one principled theory to the next that would have predicted the 
existence of Mercury’s perihelion precession and motivated earlier discovery of the 
phenomenon. 


3.7 Perturbation from General Relativity 


We have talked about Einstein’s General Relativity being the deeper theory that 
explains Mercury’s perihelion precession. It is worthwhile in these lectures to go 
through that computation to see how it comes about. 

We wish to compute the trajectory of a particle subject to a central, radially 
symmetric gravitating source in the general approach followed, for example, by (Hartl 
2003). The metric applicable for this computation is the Schwarzschild metric: 


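\[
ds^2 = -\eta(r)\,c^2 dt^2 + \frac{dr^2}{\eta(r)} + r^2 d\theta^2 + r^2 \sin^2\theta\, d\phi^2 \tag{3.56}
\]

where

\[
\eta(r) = 1 - \frac{2GM}{c^2 r} = 1 - \frac{r_s}{r}, \quad \text{where } r_s = 2GM/c^2. \tag{3.57}
\]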


The quantity $r_s$ is the Schwarzschild radius. This defines the metric tensor to be

\[
g_{\alpha\beta} = \mathrm{diag}\left(-\eta(r),\ \eta^{-1}(r),\ r^2,\ r^2 \sin^2\theta\right) \tag{3.58}
\]

in the $(t, r, \theta, \phi)$ basis. Note that the signature of the metric (asymptotically weak
field far away) in normal rectilinear coordinates is $g_{\alpha\beta} = \mathrm{diag}(-1, 1, 1, 1)$.

The Schwarzschild metric is unperturbed by making shifts in the time direction
and by making shifts in the angular direction $\phi$. These define Killing vectors
$\xi_{\rm time} = (1, 0, 0, 0)$ and $\xi_{\phi} = (0, 0, 0, 1)$. The nice property of a Killing vector is
that when dotted into the four-velocity vector $dx^\alpha/d\tau$ the result must be constant
along the geodesic motion:

\[
\xi \cdot \frac{dx}{d\tau} = g_{\alpha\beta}\, \xi^\alpha \frac{dx^\beta}{d\tau} = \text{const.} \tag{3.59}
\]


Applying this theorem to the Schwarzschild metric gives 


\[
g_{\alpha\beta}\, \xi^\alpha_{\rm time} \frac{dx^\beta}{d\tau} = -\eta(r) \frac{dt}{d\tau} = -c_1 \tag{3.60}
\]
\[
g_{\alpha\beta}\, \xi^\alpha_{\phi} \frac{dx^\beta}{d\tau} = r^2 \sin^2\theta\, \frac{d\phi}{d\tau} = c_2 \tag{3.61}
\]


where $c_1$ and $c_2$ are mere constants. We know that independence of time implies
conservation of energy, and we also know that independence of rotation implies
conservation of angular momentum. Thus, we know that $c_1$ is some function of energy,
and we know that $c_2$ is some function of angular momentum as we usually define
the quantities. However, at this stage we do not know the precise correspondence, so
it is prudent to just carry the constants $c_1$ and $c_2$ with us until the precise relations
become obvious.

From Eqs. 3.60 and 3.61 we solve for $dt/d\tau = c_1/\eta(r)$ and $d\phi/d\tau = c_2/(r^2 \sin^2\theta)$.
Now, we should simplify this all by taking the orbit in the $\theta = \pi/2$ plane, so that
$d\phi/d\tau = c_2/r^2$. Please note, conservation laws have given us this, and this is where
deep physics lies. Now, let’s expand out the defining equation of the four-velocity


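\[
g_{\alpha\beta} \frac{dx^\alpha}{d\tau} \frac{dx^\beta}{d\tau} = -1, \quad \text{which gives} \tag{3.62}
\]
\[
-\eta(r) \left(\frac{dt}{d\tau}\right)^2 + \frac{1}{\eta(r)} \left(\frac{dr}{d\tau}\right)^2 + r^2 \left(\frac{d\phi}{d\tau}\right)^2 = -1 \tag{3.63}
\]


for the Schwarzschild metric. Substituting the values of $d\phi/d\tau$ and $dt/d\tau$ that we
obtained above from the Killing equations, we find


\[
-\frac{c_1^2}{\eta(r)} + \frac{1}{\eta(r)} \left(\frac{dr}{d\tau}\right)^2 + \frac{c_2^2}{r^2} = -1. \tag{3.64}
\]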


After carrying out some algebra one finds 


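\[
\frac{mc^2}{2}\left(c_1^2 - 1\right) = \frac{mc^2}{2}\left(\frac{dr}{d\tau}\right)^2 - \frac{GMm}{r} + \frac{mc^2 c_2^2}{2r^2} - \frac{GMm\, c_2^2}{r^3} \tag{3.65}
\]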

The form of Eq. 3.65 is very suggestive of our equation for energy of a particle in 
an orbit, and the correspondence becomes precise if we make the identifications 


\[
\frac{mc^2}{2}\left(c_1^2 - 1\right) = E \quad \text{and} \quad c_2^2 = \frac{L^2}{m^2c^2}. \tag{3.66}
\]


We also can identify $\tau = ct$ in the non-relativistic limit. It turns out that this
substitution is acceptable for the problem at hand as long as $r_s \ll r_0$, which is generally
the situation for low eccentricity orbits, and certainly the case for the planetary orbits
of our solar system. Making these identifications the energy equation becomes


\[
E = \frac{1}{2} m \left(\frac{dr}{dt}\right)^2 + \frac{L^2}{2mr^2} - \frac{GMm}{r} - \frac{GMm\,(L^2/m^2c^2)}{r^3}. \tag{3.67}
\]


This is the energy equation for a particle in Newtonian gravity except for the small 
shift in the effective potential 


\[
\Delta V_{\rm eff}(r) = -\frac{GMm\,(L^2/m^2c^2)}{r^3}, \tag{3.68}
\]


which is precisely the same correction to Newton’s theory we derived earlier from
Alice’s effective theory approach to explain Mercury’s precession in Eq. 3.54.

There are multiple ways to derive the correction to Newton’s gravity law for the 
particular problem of perihelion precessions. In our derivation, we found Alice’s 
theory correction. This is also the result derived in General Relativity by many other 
authors (see e.g., Schutz 1990; Goldstein et al. 2002; Hartl 2003). However, another 
approach to the General Relativity derivation gives Bob’s theory, and that has been 
demonstrated by a set of different authors (see e.g., Paul 1981; Landau and Lifshitz 
1975; Iwasaki 1971; Donoghue 2009). These two theories, if treated as god-given 
complete theories, are not equivalent. However, they are equivalent results for this 
problem as all approximations and culling of the General Relativity terms have been 
carried out with the sole purpose of finding the perihelion precession. In the end, the
precession angle per orbital period from either correction is the same:


\[
\Delta\phi = \frac{6\pi\, GM/c^2}{a(1 - e^2)}. \tag{3.69}
\]

Algebraically, the orbital identity

\[
L^2 = GMm^2 a(1 - e^2) \tag{3.70}
\]

is what guarantees that the two solutions predict the same anomalous perihelion
precession rate. So, we see that Albert explains both Alice’s theory and Bob’s theory,
and puts them on firmer footing.
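
As a rough numerical check of Eq. (3.69), the following Python sketch evaluates the
precession per orbit for Mercury and converts it to arcseconds per century. The
orbital and solar parameters are standard reference values that I am assuming here;
they are not quoted in the text, and the function names are purely illustrative.

```python
import math

# Assumed reference values (not taken from the text).
GM_SUN = 1.32712440018e20   # m^3 / s^2, gravitational parameter of the Sun
C = 299_792_458.0           # m / s, speed of light
A_MERCURY = 5.7909e10       # m, semi-major axis of Mercury's orbit
E_MERCURY = 0.2056          # eccentricity of Mercury's orbit
PERIOD_DAYS = 87.969        # orbital period of Mercury in days


def precession_per_orbit(gm, a, e, c=C):
    """Anomalous perihelion advance per orbit in radians, as in Eq. (3.69)."""
    return 6.0 * math.pi * gm / (c**2 * a * (1.0 - e**2))


def arcsec_per_century(gm, a, e, period_days):
    """Accumulate the per-orbit advance over one Julian century (36525 days)."""
    orbits_per_century = 36525.0 / period_days
    radians = precession_per_orbit(gm, a, e) * orbits_per_century
    return math.degrees(radians) * 3600.0


if __name__ == "__main__":
    rate = arcsec_per_century(GM_SUN, A_MERCURY, E_MERCURY, PERIOD_DAYS)
    print(f"{rate:.1f} arcsec per century")
```

With these inputs the result lands close to the 43 arcseconds per century attributed
to the anomalous precession of Mercury.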


3.8 Conclusions


At the beginning of these lectures we decided that Newton’s law of Gravitation 
was very successful in describing the orbits, but that it is not the precise law that 
captures our most profound admiration. Rather, it is the symmetries that the theory 
possesses. We elevated those to the highest principles and constructed reasonable 
effective theories that could be tested by the data. We illustrated the results with
two theories: Bob’s $1/r^2$ and Alice’s $1/r^3$ potential correction theories. Both theories
were able to account for the perihelion precession rate naturally. We even made the case that
philosophical challenges to Newton’s world view, if taken seriously, could presage 
the size of Mercury’s correction that was actually measured by Le Verrier. This is 
done with the aid of “naturalness” arguments about the speed of light being the next 
speed scale of nature by which to construct corrections to Newton’s potential. In this 
way the concepts of natural effective theory have some predictive power. That power 
is certainly qualitative, but also to some degree quantitative. 

Einstein had keen insights into the nature of space and time and developed the 
theory of General Relativity based on them. It describes gravity at a deeper level, and 
one of its first orders of business was to compute the anomalous precession rate of 
Mercury to see if it could account for the discrepancy between Newton’s theory and 
measurement. The answer is yes, and we have shown that this correction matches 
nicely the effective theory results of Alice and Bob. 

Einstein’s General Relativity theory is “better” than Alice’s theory or Bob’s theory
for two reasons. First, it gives a deeper principled understanding of the correction
with no additional free parameters. This deeper understanding is nothing other than
further assumptions on spacetime symmetries that panned out. Second, Einstein’s
theory is a more complete theory of gravity that makes additional predictions (such
as bending of light, and binary pulsar spin-down) that are confirmed by data. Alice’s
or Bob’s theory clearly cannot match the riches of General Relativity and so cannot
be considered as fundamental as Einstein’s.

Despite Bob and Alice’s theories coming up short, the general lesson remains.
Newton’s theory was an effective theory, which is in some aspects superseded in
success by Bob and Alice’s effective theories, and Bob and Alice’s effective theories
are superseded in success by Einstein’s General Relativity. The obvious next
question is whether Einstein’s General Relativity can itself be superseded in success
by another theory: a deeper theory whose consequences could perhaps be expressed as
an effective theory expansion of Einstein’s theory for the purposes of solving some
lower energy precision measurement problem. There is little doubt that this is the
case (Donoghue 1994).

Finally, one of the most profound shifts in our thinking over the decades,
illustrated well by the perihelion precession example, is that it is really no longer
appropriate to speak of “the correct theory.” There is no correct theory. Our tasks are
to improve theories via the effective theory approach, to seek deeper and simplifying
assumptions that account for the improvements, to solidify those into a new theory,
and then to treat that new theory as an effective theory, and repeat. These steps are
accomplished by continually improving and refining observations and theory
computations that enable us to choose between effective theories, followed by deducing
deeper new symmetries that force the inevitability of the chosen theory. Theories are
never to be trusted (they are always “wrong” in the end), and with concerted effort we
can even anticipate when and how they will break down.

The concepts of Effective Theory lead one to predict qualitatively that a perihelion 
precession of Mercury was a priori guaranteed even knowing only the experimental 
facts of the Newtonian era. In particular, elevating symmetries above the concretiza- 
tion of hypothesized law, in this case the rigid devotion to the inverse square law, 
is the basic ingredient that would have led unambiguously to this conclusion. The 
general approach to science during the Newtonian era required almost complete 
devotion to concrete laws and their propositional justifications, which impeded its 
progress toward developing theory enhancements guided by symmetries and natu- 
ralness. Gerber, a school teacher who was perhaps not as indoctrinated in this more 
rigid fashion, found a potential that worked yet then made unjustified arguments for 
why it should be true. Effective Theories give the best of both worlds: deep but modest
justifications for theories that can anticipate data and fit the data. 

We have also shown that even during the time of Newton a reasonably well 
supported hypothesis for the perihelion precession of Mercury could have been put 
forth that is close to the actual experimental result of 43” of arc per century. This is a 
clear illustration of how the ideas of Effective Theory can be utilized to extrapolate 
modestly beyond the rigidly set forth laws of fundamental physics. 


Chapter 4
Effective Theories and Elementary 
Particle Masses 


Abstract The concepts of effective theory have a rich history in particle physics. 
The early days of effective theories have many examples, including Fermi’s theory of 
nucleon decay and chiral lagrangian dynamics for pion scattering. These examples 
are touched upon briefly before going to the most pressing issue of today, which 
is the origin of elementary particle masses. The problem of mass generation is first 
described, where it is shown that simply writing down mass terms manifestly breaks 
cherished symmetries. It is then shown that spontaneous symmetry breaking cures 
this problem. The influence of effective field theory is then addressed, where it is 
shown that the smallness of neutrino masses nicely conforms with our intuition, but 
the weak-scale value of the Higgs boson mass is confusing. The chapter concludes 
with an essay describing this mystery and what the resolutions might be. 


4.1 Introduction 


Effective theories play a central role in particle physics. Perhaps the most famous 
effective theory of them all is Fermi’s four-fermion interaction theory that described 
nucleon decay and muon decay. The theory is a “V-A theory” (vector minus axial 
vector interaction) and has the form: 


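\[
\mathcal{L} = -\frac{G_F}{\sqrt{2}} \left[\bar\psi_{f_1} \gamma^\mu (1 - \gamma^5) \psi_{f_2}\right] \left[\bar\psi_{f_3} \gamma_\mu (1 - \gamma^5) \psi_{f_4}\right] \tag{4.1}
\]


where $G_F \simeq 1.17 \times 10^{-5}$ GeV$^{-2}$ is the Fermi constant determined by experiment.
These operators can then induce $\beta$ decays of the neutron via the constituent quark
decays $d \to u e \bar\nu_e$, and can also induce muon decay through $\mu \to e\, \nu_\mu \bar\nu_e$. The history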
behind determining the precise nature of this interaction is a fascinating one that 
required painstaking experiment and insightful theory (Renton 1990). 

We know now that the Fermi theory is just the low-energy limit of the electroweak
theory of the Standard Model.¹ The Fermi constant $G_F$ that gives the strength of the
four-fermion interaction is the low-energy limit of a W-boson propagator multiplied
by its couplings to the two bilinear currents:


\[
\frac{g^2}{8 M_W^2} = \frac{G_F}{\sqrt{2}} \tag{4.2}
\]


where $g$ is the SU(2)$_L$ gauge coupling of the Standard Model. Thus the propagator of
the W-boson at very low energies compared with the W mass contracts to a point and
makes an effective four-fermion interaction term governed by the Fermi Effective
Theory coupling constant $G_F$.
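
To see that this matching gives the right order of magnitude, here is a minimal
Python sketch that evaluates $G_F = \sqrt{2}\, g^2/(8 M_W^2)$. The coupling $g = 0.65$ is the
value quoted later in this chapter; the W mass of 80.4 GeV is a reference value I am
assuming for illustration, and the function name is hypothetical.

```python
import math

G_WEAK = 0.65     # SU(2)_L gauge coupling, as quoted later in the chapter
M_W_GEV = 80.4    # W boson mass in GeV (assumed reference value)


def fermi_constant(g, m_w):
    """Tree-level Fermi constant in GeV^-2 from G_F / sqrt(2) = g^2 / (8 M_W^2)."""
    return math.sqrt(2.0) * g**2 / (8.0 * m_w**2)


if __name__ == "__main__":
    print(f"G_F ~ {fermi_constant(G_WEAK, M_W_GEV):.3e} GeV^-2")
```

The tree-level estimate comes out close to $10^{-5}$ GeV$^{-2}$, in line with the measured
Fermi constant given the rounded value of $g$ used here.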

Another place where Effective Theories are put to good use is in low-energy 
pion scattering theory. Pions are the lightest strongly interacting hadrons known in 
nature. The pions will interact with a very large number of other hadrons in the 
theory to mediate and alter even pure pion-pion scattering. Computing all of these 
interactions with the multitude of other intermediate hadrons is a daunting prospect 
to say the least. However, the effective lagrangian approach allows one to condense
the complicated dynamics of interactions with higher-mass particles into a few
low-energy parameters of a chiral lagrangian. This technique is described well in many
places (Donoghue et al. 1992).

Yet another manifestation of the power of effective theories is Wilson’s discovery
of the renormalization group (Wilson and Kogut 1974; Peskin and Schroeder
1995). There it was understood in a general way that at low energies all modes
can be “integrated out” to form an effective lagrangian with renormalization group
improved parameters. This integration-out procedure was not just hiding the effects
of heavier particles in non-dynamical lagrangian mass scales, but also resumming
all the higher momentum mode contributions above a cut-off scale. This technique
has been extremely powerful in particle physics both as a technically useful tool
to resum large quantum logarithms and as a conceptual tool to understand the
energy flow of a theory. All modern quantum field theory textbooks, including
Peskin and Schroeder (1995), have very thorough treatments of this most important
issue.

There are numerous other examples of effective theories being employed in the 
particle physics context. All of the theories of physics beyond the Standard Model 
also utilize the concepts in one form or another. The language of effective theory 
concepts is so deeply ingrained in the minds of practitioners that there is now rarely
a need to explicitly point out or argue for its utility.

There is, however, one area of particle physics where the notions of effective
theories are hard to mesh with reality. This is regarding the structure of the vacuum.
For one, effective theory concepts would tell us that the cosmological constant should
be many orders of magnitude larger than what we observe today. This is usually just
ignored in the field, with hope that some other quantum-gravity solution as yet not
understood will come to save the cosmological constant. I will not talk about this. The
second place where our experimental understanding of the vacuum may be at odds with
effective theories is in the generation of elementary particle masses. That will be the
focus of this chapter. I will first outline the challenges to giving mass to elementary
particles in chiral theories and then I will give a brief introduction to the Standard
Model electroweak theory. After that I will describe the effective theory issues for
the masses of leptons and quarks, neutrinos, and the Higgs boson. The Higgs boson
is especially interesting since it has likely been discovered lately (Aad et al. 2012;
Chatrchyan et al. 2012), and the theoretical controversies surrounding why its mass
is light are very hot today.

¹ The electroweak theory of the Standard Model will be discussed in more detail in Sect. 4.3.


4.2 The Problem of Mass in Chiral Gauge Theories 


The fermions of the Standard Model and some of the gauge bosons have mass. 
This is a troublesome statement since gauge invariance appears to allow neither. 
Let us review the situation for gauge bosons and chiral fermions and introduce the 
Higgs mechanism that solves it. First, we illustrate the concepts with a massive U (1) 
theory—spontaneously broken QED. 


Gauge Boson Mass 


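The lagrangian of QED is

\[
\mathcal{L}_{\rm QED} = -\frac{1}{4} F_{\mu\nu} F^{\mu\nu} + \bar\psi \left(i\gamma^\mu D_\mu - m\right) \psi \tag{4.3}
\]

where

\[
D_\mu = \partial_\mu + i e Q A_\mu \tag{4.4}
\]

and $Q = -1$ is the charge of the electron. This lagrangian respects the U(1) gauge
symmetry

\[
\psi \to e^{-iQ\alpha(x)} \psi \tag{4.5}
\]
\[
A_\mu \to A_\mu + \frac{1}{e} \partial_\mu \alpha(x). \tag{4.6}
\]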


Since QED is a vector-like theory—left-handed electrons have the same charge as 
right-handed electrons—an explicit mass term for the electron does not violate gauge 
invariance. 

If we wish to give the photon a mass we may add to the lagrangian the mass term 


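\[
\mathcal{L}_{\rm mass} = \frac{m_A^2}{2} A_\mu A^\mu. \tag{4.7}
\]


However, this term is not gauge invariant since under a transformation $A_\mu A^\mu$
becomes


\[
A_\mu A^\mu \to A_\mu A^\mu + \frac{2}{e} A^\mu \partial_\mu \alpha + \frac{1}{e^2} \partial_\mu \alpha\, \partial^\mu \alpha. \tag{4.8}
\]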
e e 


This is not the right way to proceed if we wish to continue respecting the gauge 
symmetry. There is a satisfactory way to give mass to the photon while retaining the 
gauge symmetry. This is the Higgs mechanism, and the simplest way to implement 
it is via an elementary complex scalar particle that is charged under the symmetry 
and has a vacuum expectation value (vev) that is constant throughout all space and 
time. This is the Higgs boson field $\Phi$.

Let us suppose that the photon in QED has a mass. To see how the Higgs boson
implements the Higgs mechanism in a gauge invariant manner, we introduce the field
$\Phi$ with charge $q$ to the lagrangian:


\[
\mathcal{L} = \mathcal{L}_{\rm QED} + (D_\mu \Phi)^* (D^\mu \Phi) - V(\Phi) \tag{4.9}
\]


where

\[
V(\Phi) = \mu^2 |\Phi|^2 + \lambda |\Phi|^4 \tag{4.10}
\]


and it is assumed that $\lambda > 0$ and $\mu^2 < 0$.
Since $\Phi$ is a complex field we have the freedom to parametrize it as


\[
\Phi = \frac{1}{\sqrt{2}}\, \phi(x)\, e^{i\xi(x)} \tag{4.11}
\]


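where $\phi(x)$ and $\xi(x)$ are real scalar fields. The scalar potential with this choice
simplifies to


\[
V(\Phi) \to V(\phi) = \frac{\mu^2}{2} \phi^2 + \frac{\lambda}{4} \phi^4. \tag{4.12}
\]


Minimizing the scalar potential one finds


\[
\left.\frac{dV}{d\phi}\right|_{\phi = \phi_0} = \mu^2 \phi_0 + \lambda \phi_0^3 = 0 \implies \phi_0 = \sqrt{\frac{-\mu^2}{\lambda}}. \tag{4.13}
\]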


This vacuum expectation value of $\phi$ enables us to normalize the $\xi$ field by $\xi/\phi_0$ such
that its kinetic term is canonical at leading order of small fluctuation, legitimizing the
parametrization of Eq. (4.11). We can now choose the unitary gauge transformation,
$\alpha(x) = -\xi(x)/\phi_0$, to make $\Phi$ real-valued everywhere. One finds that the complex
scalar kinetic terms expand to


\[
(D_\mu \Phi)^* (D^\mu \Phi) \to \frac{1}{2} (\partial_\mu \phi)^2 + \frac{1}{2} e^2 q^2 \phi^2 A_\mu A^\mu. \tag{4.14}
\]

At the minimum of the potential $\langle \phi \rangle = \phi_0$, so one can expand the field $\phi$ about its
vev, $\phi = \phi_0 + h$, and identify the fluctuating degree of freedom $h$ with a propagating
real scalar boson.

The Higgs boson mass and self-interactions are obtained by expanding the
lagrangian about $\phi_0$. The result is


\[
-\mathcal{L}_{\rm Higgs} = \frac{m_h^2}{2} h^2 + \frac{\mu}{3!} h^3 + \frac{\eta}{4!} h^4 \tag{4.15}
\]

where

\[
m_h^2 = 2\lambda \phi_0^2, \quad \mu = \frac{3 m_h^2}{\phi_0}, \quad \eta = 6\lambda = \frac{3 m_h^2}{\phi_0^2}. \tag{4.16}
\]


The mass of the Higgs boson is not dictated by gauge couplings here, but rather by
its self-interaction coupling $\lambda$ and the vev.
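
The minimization and the mass formula can be checked numerically. The sketch below,
written under the assumption of purely illustrative parameter values ($\mu^2 = -1$,
$\lambda = 0.5$ in arbitrary units), evaluates the potential of Eq. (4.12), confirms that
its first derivative vanishes at $\phi_0$, and compares a finite-difference second
derivative with $m_h^2 = 2\lambda\phi_0^2$. All function names are hypothetical.

```python
import math


def potential(phi, mu2, lam):
    """Scalar potential V(phi) = mu^2 phi^2 / 2 + lambda phi^4 / 4, Eq. (4.12)."""
    return 0.5 * mu2 * phi**2 + 0.25 * lam * phi**4


def vev(mu2, lam):
    """Location of the minimum, phi_0 = sqrt(-mu^2 / lambda), for mu^2 < 0."""
    return math.sqrt(-mu2 / lam)


def second_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2


MU2, LAM = -1.0, 0.5          # illustrative values, arbitrary units
phi0 = vev(MU2, LAM)
slope_at_min = MU2 * phi0 + LAM * phi0**3                  # dV/dphi at phi_0
m_h2_numeric = second_derivative(lambda p: potential(p, MU2, LAM), phi0)
m_h2_formula = 2.0 * LAM * phi0**2                         # Eq. (4.16)

if __name__ == "__main__":
    print(phi0, slope_at_min, m_h2_numeric, m_h2_formula)
```

The curvature of the potential at its minimum is what the expansion identifies as the
Higgs boson mass squared, which is why the two numbers agree up to the
finite-difference error.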
The complex Higgs boson kinetic terms can be expanded to yield


\[
\Delta\mathcal{L} = \frac{1}{2} e^2 q^2 \phi_0^2 A_\mu A^\mu + e^2 q^2 \phi_0\, h A_\mu A^\mu + \frac{1}{2} e^2 q^2 h^2 A_\mu A^\mu. \tag{4.17}
\]

The first term is the mass of the photon, $m_A^2 = e^2 q^2 \phi_0^2$. A massive vector boson has a
longitudinal degree of freedom, in addition to its two transverse degrees of freedom,
which accounts for the degree of freedom lost by virtue of gauging away $\xi(x)$. The
second and third terms of Eq. 4.17 set the strength of interaction of a single Higgs
boson and two Higgs bosons to a pair of photons:


\[
h A_\mu A_\nu \text{ Feynman rule}: \quad i 2 e^2 q^2 \phi_0\, g_{\mu\nu} = i 2 \frac{m_A^2}{\phi_0}\, g_{\mu\nu} \tag{4.18}
\]
\[
h h A_\mu A_\nu \text{ Feynman rule}: \quad i 2 e^2 q^2 g_{\mu\nu} = i 2 \frac{m_A^2}{\phi_0^2}\, g_{\mu\nu} \tag{4.19}
\]

after appropriate symmetry factors are included.

The general principles to retain from this discussion are as follows. First, gauge
boson masses can be generated in a gauge-invariant way through the Higgs mechanism.
Second, the Higgs boson that gets a vev breaks whatever symmetries it is charged
under: the Higgs vev carries charge into the vacuum. And finally, the Higgs boson that
gives mass to the gauge boson couples to it in proportion to the gauge boson mass.


Chiral Fermion Masses


In quantum field theory a four-component fermion can be written in its chiral
basis as

\[
\psi = \begin{pmatrix} \psi_L \\ \psi_R \end{pmatrix} \tag{4.20}
\]


where $\psi_{L,R}$ are two-component chiral projection fermions. A mass term in quantum
field theory is equivalent to an interaction between the $\psi_L$ and $\psi_R$ components

\[
m \bar\psi \psi = m\, \psi_L^\dagger \psi_R + m\, \psi_R^\dagger \psi_L. \tag{4.21}
\]


In vectorlike QED, the $\psi_L$ and $\psi_R$ components have the same charge and a mass
term can simply be written down. However, let us now suppose that in our toy U(1)
model, there exists a set of chiral fermions where the $P_L \psi = \psi_L$ chiral projection
carries a different gauge charge than the $P_R \psi = \psi_R$ chiral projection. In that case,
we cannot write down a simple mass term without explicitly breaking the gauge
symmetry.

The resolution to this conundrum of masses for chiral fermions resides in the Higgs
sector. If the Higgs boson has just the right charge, it can be utilized to give mass to
the chiral fermions. For example, if the charges² are $Q[\psi_L] = 1$, $Q[\psi_R] = 1 - q$
and $Q[\Phi] = q$ we can form the gauge invariant combination


\[
\mathcal{L}_f = y_f\, \psi_L^\dagger \Phi\, \psi_R + \text{c.c.} \tag{4.22}
\]


where $y_f$ is a dimensionless Yukawa coupling. Now expand the Higgs boson about
its vev, $\Phi = (\phi_0 + h)/\sqrt{2}$, and we find


\[
\mathcal{L}_f = m_f\, \psi_L^\dagger \psi_R + \left(\frac{m_f}{\phi_0}\right) h\, \psi_L^\dagger \psi_R + \text{c.c.} \tag{4.23}
\]


where $m_f = y_f \phi_0/\sqrt{2}$.

We have successfully generated a mass by virtue of the Yukawa interaction with 
the Higgs boson. That same Yukawa interaction gives rise to an interaction between 
the physical Higgs boson and the fermions: 


\[
h \bar\psi \psi \text{ (Feynman rule)}: \quad i \frac{m_f}{\phi_0} \tag{4.24}
\]


Just as was the case with the gauge bosons, the generation of fermion masses by the 
Higgs boson leads to an interaction of the physical Higgs bosons with the fermion 
proportional to the fermion mass. As we will see in the Standard Model, this rigid 
connection between mass and interaction is what enables us to anticipate Higgs boson 
phenomenology with great precision once the mass is precisely known. 


² We ignore the additional fields that would be needed in order to make the spectrum gauge anomaly
free. Doing so is straightforward and would not change the message of this example.


4.3 Standard Model Electroweak Theory


The bosonic electroweak lagrangian is an SU(2)$_L$ × U(1)$_Y$ gauge invariant theory


\[
\mathcal{L}_{\rm bos} = |D_\mu \Phi|^2 - \mu^2 |\Phi|^2 - \lambda |\Phi|^4 - \frac{1}{4} B_{\mu\nu} B^{\mu\nu} - \frac{1}{4} W^a_{\mu\nu} W^{a,\mu\nu} \tag{4.25}
\]


where $\Phi$ is an electroweak doublet with Standard Model charges of (2, 1/2) under
SU(2)$_L$ × U(1)$_Y$ ($Y/2 = +1/2$). In our normalization electric charge is $Q = T^3 + \frac{Y}{2}$,
and the doublet field $\Phi$ can be written as two complex scalar component fields $\phi^+$
and $\phi^0$:

\[
\Phi = \begin{pmatrix} \phi^+ \\ \phi^0 \end{pmatrix}. \tag{4.26}
\]

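The covariant derivative and field strength tensors are


\[
D_\mu \Phi = \left(\partial_\mu + i g \frac{\tau^a}{2} W^a_\mu + i g' \frac{Y}{2} B_\mu\right) \Phi \tag{4.27}
\]
\[
B_{\mu\nu} = \partial_\mu B_\nu - \partial_\nu B_\mu \tag{4.28}
\]
\[
W^a_{\mu\nu} = \partial_\mu W^a_\nu - \partial_\nu W^a_\mu - g f^{abc} W^b_\mu W^c_\nu \tag{4.29}
\]


The minimum of the potential does not occur at $\Phi = 0$ if $\mu^2 < 0$. Instead, one
finds that the minimum occurs at a non-zero value of $\Phi$, its vacuum expectation
value (vev), which via a gauge transformation can always be written as


\[
\langle \Phi \rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ v \end{pmatrix} \quad \text{where} \quad v \equiv \sqrt{\frac{-\mu^2}{\lambda}}. \tag{4.30}
\]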


This vev carries hypercharge and weak gauge charge into the vacuum, and what is
left unbroken is electric charge. This result we anticipated in Eq. (4.26) by defining
a charge $Q$ in terms of hypercharge and an eigenvalue of the SU(2) generator $T^3$,
and then writing the field $\Phi$ in terms of $\phi^0$ and $\phi^+$ of zero and $+1$ definite
electric charge.

Our symmetry breaking pattern is then simply SU(2)$_L$ × U(1)$_Y$ → U(1)$_Q$. The
original group, SU(2)$_L$ × U(1)$_Y$, has a total of four generators and U(1)$_Q$ has one
generator. Thus, three generators are ‘broken’. Goldstone’s theorem (Goldstone et al.
1962) tells us that for every broken generator of a symmetry there must correspond
a massless field. These three massless Goldstone bosons we can call $\phi_{1,2,3}$. We now
can rewrite the full Higgs field $\Phi$ as


\[
\Phi = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ v + h \end{pmatrix} + \frac{1}{\sqrt{2}} \begin{pmatrix} \phi_1 + i\phi_2 \\ i\phi_3 \end{pmatrix}. \tag{4.31}
\]
The fourth degree of freedom of $\Phi$ is the Standard Model Higgs boson $h$. It is a
propagating degree of freedom. The other three states $\phi_{1,2,3}$ can all be absorbed as
longitudinal components of three massive vector gauge bosons $Z$, $W^\pm$ which are
defined by


\[
W^\pm_\mu = \frac{1}{\sqrt{2}} \left(W^{(1)}_\mu \mp i W^{(2)}_\mu\right) \tag{4.32}
\]
\[
B_\mu = \frac{-g' Z_\mu + g A_\mu}{\sqrt{g^2 + g'^2}} \tag{4.33}
\]
\[
W^{(3)}_\mu = \frac{g Z_\mu + g' A_\mu}{\sqrt{g^2 + g'^2}} \tag{4.34}
\]


It is convenient to define $\tan\theta_W = g'/g$. By measuring interactions of the gauge
bosons with fermions it has been determined experimentally that $g = 0.65$ and
$g' = 0.35$, and therefore $\sin^2\theta_W = 0.23$.

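After performing the redefinitions of the fields above, the kinetic terms for the
$W^\pm_\mu$, $Z_\mu$, $A_\mu$ will all be canonical. Expanding the Higgs field about the vacuum, the
contributions to the lagrangian involving Higgs boson interaction terms are


\[
\mathcal{L}_{h\,{\rm int}} = \left[m_W^2 W^+_\mu W^{-,\mu} + \frac{m_Z^2}{2} Z_\mu Z^\mu\right] \left(1 + \frac{h}{v}\right)^2 \tag{4.35}
\]
\[
\qquad\qquad - \frac{m_h^2}{2} h^2 - \frac{\xi}{3!} h^3 - \frac{\eta}{4!} h^4 \tag{4.36}
\]

where

\[
m_W^2 = \frac{1}{4} g^2 v^2, \quad m_Z^2 = \frac{1}{4} (g^2 + g'^2) v^2 \implies \frac{m_W^2}{m_Z^2} = 1 - \sin^2\theta_W \tag{4.37}
\]
\[
m_h^2 = 2\lambda v^2, \quad \xi = \frac{3 m_h^2}{v}, \quad \eta = 6\lambda = \frac{3 m_h^2}{v^2}. \tag{4.38}
\]


From our knowledge of the gauge couplings, the value of the vev $v$ can be determined
from the masses of the gauge bosons: $v \simeq 246$ GeV.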
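
The relations of Eq. (4.37) are easy to evaluate with the rounded couplings quoted
above. The following Python sketch does so; the inputs $g = 0.65$, $g' = 0.35$ and
$v = 246$ GeV are the values given in the text, while the function names are only
illustrative.

```python
import math

G, G_PRIME, V_GEV = 0.65, 0.35, 246.0   # rounded values quoted in the text


def gauge_boson_masses(g, g_prime, v):
    """Tree-level W and Z masses in GeV from Eq. (4.37)."""
    m_w = 0.5 * g * v
    m_z = 0.5 * math.sqrt(g**2 + g_prime**2) * v
    return m_w, m_z


def sin2_theta_w(g, g_prime):
    """sin^2(theta_W) implied by tan(theta_W) = g'/g."""
    return g_prime**2 / (g**2 + g_prime**2)


if __name__ == "__main__":
    m_w, m_z = gauge_boson_masses(G, G_PRIME, V_GEV)
    print(f"m_W ~ {m_w:.1f} GeV, m_Z ~ {m_z:.1f} GeV")
    print(f"sin^2(theta_W) ~ {sin2_theta_w(G, G_PRIME):.3f}")
```

With these rounded inputs the masses come out near 80 GeV and 91 GeV, and the ratio
$m_W^2/m_Z^2$ reproduces $1 - \sin^2\theta_W$ identically at tree level.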
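The Feynman rules for Higgs boson interactions are


\[
hhh: \quad -i \frac{3 m_h^2}{v} \tag{4.39}
\]
\[
hhhh: \quad -i \frac{3 m_h^2}{v^2} \tag{4.40}
\]
\[
h W^+_\mu W^-_\nu: \quad i 2 \frac{m_W^2}{v}\, g_{\mu\nu} \tag{4.41}
\]
\[
h Z_\mu Z_\nu: \quad i 2 \frac{m_Z^2}{v}\, g_{\mu\nu} \tag{4.42}
\]
\[
hh W^+_\mu W^-_\nu: \quad i 2 \frac{m_W^2}{v^2}\, g_{\mu\nu} \tag{4.43}
\]
\[
hh Z_\mu Z_\nu: \quad i 2 \frac{m_Z^2}{v^2}\, g_{\mu\nu} \tag{4.44}
\]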

Fermion masses are also generated in the Standard Model through the Higgs boson
vev, which in turn induces an interaction between the physical Higgs boson and the
fermions. Let us start by looking at b quark interactions. The relevant lagrangian for
couplings with the Higgs boson is


\[
\Delta\mathcal{L} = y_b\, Q_L^\dagger \Phi\, b_R + \text{c.c.} \quad \text{where} \quad Q_L^\dagger = (t_L^\dagger \ \ b_L^\dagger) \tag{4.45}
\]


where $y_b$ is the Yukawa coupling. The Higgs boson, after a suitable gauge
transformation, can be written simply as


\[
\Phi = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ v + h \end{pmatrix} \tag{4.46}
\]


and the interaction lagrangian can be expanded to


\[
\Delta\mathcal{L} = y_b\, Q_L^\dagger \Phi\, b_R + \text{c.c.} = \frac{y_b}{\sqrt{2}} (t_L^\dagger \ \ b_L^\dagger) \begin{pmatrix} 0 \\ v + h \end{pmatrix} b_R + \text{c.c.} \tag{4.47}
\]
\[
\qquad = m_b \left(b_R^\dagger b_L + b_L^\dagger b_R\right) \left(1 + \frac{h}{v}\right) = m_b\, \bar b b \left(1 + \frac{h}{v}\right) \tag{4.48}
\]


where $m_b = y_b v/\sqrt{2}$ is the mass of the b quark.

The quantum numbers work out perfectly to allow this mass term. See Table 4.1
for the quantum numbers of the various fields under the Standard Model symmetries.
Under SU(2) the interaction $Q_L^\dagger \Phi b_R$ is invariant because $2 \times 2 \times 1 \ni 1$ contains
a singlet. And under U(1)$_Y$ hypercharge the interaction is invariant because
$Y_{Q_L^\dagger} + Y_\Phi + Y_{b_R} = -\frac{1}{6} + \frac{1}{2} - \frac{1}{3}$ sums to zero, where the values are the
hypercharge entries listed in Table 4.1. Thus, the interaction is invariant under all
gauge groups, and we have found a suitable way to give mass to the bottom quark.

How does this work for giving mass to the top quark? Obviously, $Q_L^\dagger \Phi t_R$ is not
invariant. However, we have the freedom to create the conjugate representation of
$\Phi$ which still transforms as a 2 under SU(2) but switches sign under hypercharge:
$\Phi^c = i\sigma^2 \Phi^*$. This implies that $Y_{\Phi^c} = -\frac{1}{2}$ and


\[
\Phi^c = \frac{1}{\sqrt{2}} \begin{pmatrix} v + h \\ 0 \end{pmatrix} \tag{4.49}
\]


when restricted to just the real physical Higgs field expansion about the vev.
Therefore, it becomes clear that $y_t\, Q_L^\dagger \Phi^c t_R + \text{c.c.}$ is now invariant since the
SU(2) invariance remains $2 \times 2 \times 1 \ni 1$ and U(1)$_Y$ invariance follows from
$Y_{Q_L^\dagger} + Y_{\Phi^c} + Y_{t_R} = -\frac{1}{6} - \frac{1}{2} + \frac{2}{3} = 0$. Similar to the b quark one obtains an
expression for the mass and Higgs boson interaction:


\[
\Delta\mathcal{L} = y_t\, Q_L^\dagger \Phi^c t_R + \text{c.c.} = \frac{y_t}{\sqrt{2}} (t_L^\dagger \ \ b_L^\dagger) \begin{pmatrix} v + h \\ 0 \end{pmatrix} t_R + \text{c.c.} \tag{4.50}
\]
\[
\qquad = m_t \left(t_R^\dagger t_L + t_L^\dagger t_R\right) \left(1 + \frac{h}{v}\right) = m_t\, \bar t t \left(1 + \frac{h}{v}\right) \tag{4.51}
\]


where $m_t = y_t v/\sqrt{2}$ is the mass of the t quark.


Table 4.1 Charges of Standard Model fields

Field                  SU(3)   SU(2)_L   T^3            Y/2     Q = T^3 + Y/2
G^a_mu (gluons)        8       1         0              0       0
(W^±_mu, W^0_mu)       1       3         (±1, 0)        0       (±1, 0)
B_mu                   1       1         0              0       0
Q_L = (u_L, d_L)       3       2         (1/2, -1/2)    1/6     (2/3, -1/3)
u_R                    3       1         0              2/3     2/3
d_R                    3       1         0              -1/3    -1/3
E_L = (nu_L, e_L)      1       2         (1/2, -1/2)    -1/2    (0, -1)
e_R                    1       1         0              -1      -1
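
The charge bookkeeping used in this section can be verified mechanically. The sketch
below encodes the $T^3$ and hypercharge entries of Table 4.1, together with the Higgs
doublet value $+1/2$ quoted in the text, using exact fractions. The dictionary layout
and function names are my own illustrative choices, not notation from the text.

```python
from fractions import Fraction as F

# (T^3, hypercharge entry) for each field component, following Table 4.1.
FIELDS = {
    "u_L": (F(1, 2), F(1, 6)),
    "d_L": (F(-1, 2), F(1, 6)),
    "u_R": (F(0), F(2, 3)),
    "d_R": (F(0), F(-1, 3)),
    "nu_L": (F(1, 2), F(-1, 2)),
    "e_L": (F(-1, 2), F(-1, 2)),
    "e_R": (F(0), F(-1)),
}
HIGGS_HYPERCHARGE = F(1, 2)   # entry for the doublet Phi; Phi^c carries the opposite sign


def electric_charge(name):
    """Electric charge as the sum of T^3 and the tabulated hypercharge entry."""
    t3, hyper = FIELDS[name]
    return t3 + hyper


def yukawa_hypercharge(left, right, conjugate_higgs):
    """Net hypercharge of (left)^dagger times Phi (or Phi^c) times (right)."""
    higgs = -HIGGS_HYPERCHARGE if conjugate_higgs else HIGGS_HYPERCHARGE
    return -FIELDS[left][1] + higgs + FIELDS[right][1]


if __name__ == "__main__":
    print("down-type with Phi:  ", yukawa_hypercharge("d_L", "d_R", False))
    print("up-type with Phi^c:  ", yukawa_hypercharge("u_L", "u_R", True))
    print("up-type with Phi:    ", yukawa_hypercharge("u_L", "u_R", False))
    print("lepton with Phi:     ", yukawa_hypercharge("e_L", "e_R", False))
```

A vanishing net hypercharge signals an allowed Yukawa term. The up-type coupling with
the unconjugated doublet fails this test, which is why the conjugate field is needed.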

The mass of the charged leptons follows in the same manner, $y_e\, E_L^\dagger \Phi\, e_R + \text{c.c.}$,
and interactions with the Higgs boson result. In all cases the Feynman rule for
Higgs boson interactions with the fermions at leading order is


\[
h \bar f f: \quad i \frac{m_f}{v}. \tag{4.52}
\]
vV 


We see from this discussion several important points. First, the single Higgs 
boson of the Standard Model can give mass to all Standard Model states, even to the 
neutrinos as we will see in the next section. It did not have to be that way. It could 
have been that quantum numbers of the fermions did not enable just one Higgs boson 
to give mass to everything. This is the Higgs boson miracle of the Standard Model. 
The second thing to keep in mind is that there is a direct connection between the 
Higgs boson giving mass to a particle and it interacting with that particle. We have
seen that all interactions are directly proportional to a mass factor. This is why Higgs
boson phenomenology is completely determined in the Standard Model just from 
the Higgs boson mass. 


4.4 The Special Case of Neutrino Masses 


For many years it was thought that neutrinos might be exactly massless. Although 
recent experiments have shown that this is not the case, the masses of neutrinos are 
extraordinarily light compared to other Standard Model fermions. In this section we 
discuss the basics of neutrino masses (Grossman 2003; De Gouvea 2004; Mohapatra 
2004; Altarelli 2007), with emphasis on how the Higgs boson plays a role. 

Some physicists define the Standard Model without a right-handed neutrino. Thus,
there is no opportunity to write down a Yukawa interaction of the left and right-handed
neutrinos with the Higgs boson that gives neutrinos a mass. A higher-dimensional
operator is needed,


\[
\mathcal{O}_\nu = \frac{\lambda_{ij}}{\Lambda} (E_{iL} \Phi)(E_{jL} \Phi) \tag{4.53}
\]


written here schematically with SU(2) and spinor index contractions suppressed,
where $E_L = (\nu_L \ \ e_L)$ is the SU(2) doublet of left-handed neutrino and electron.
Taking into account the various flavors $i = 1, 2, 3$ results in a 3 × 3 mass matrix for
neutrino masses


\[
(m_\nu)_{ij} = \lambda_{ij} \frac{v^2}{\Lambda}. \tag{4.54}
\]


$\Lambda$ can be considered the cutoff of the Standard Model effective theory (see Sect. 4.5),
and the operator given by Eq. (4.53) is the only gauge-invariant, Lorentz-invariant
operator that one can write down at the next higher dimension ($d = 5$) in the theory.
Thus, it is a satisfactory approach to neutrino physics, leading to an indication of
new physics beyond the Standard Model at the scale $\Lambda$. For this reason, many view
the existence of neutrino masses as a signal for physics beyond the Standard Model.

The absolute value of neutrino masses has not been measured but the mass-squared
differences between various neutrino masses have been measured and range
from about $10^{-5}$ to $10^{-3}$ eV$^2$ (Grossman 2003; De Gouvea 2004; Mohapatra 2004;
Altarelli 2007; Kayser 2012). It is reasonable therefore to suppose that the largest
neutrino mass in the theory should be around 0.1 eV. If we assume that this mass
scale is obtained using the natural value of $\lambda \sim 1$ in Eq. (4.54) and a large mass scale
$\Lambda$, this sets the scale of the cutoff $\Lambda$ to be


\[
\Lambda \sim \frac{(246\ \text{GeV})^2}{0.1\ \text{eV}} \sim 10^{15}\ \text{GeV}. \tag{4.55}
\]


This is a very interesting scale, since it is within an order of magnitude of where the
three gauge couplings of the Standard Model come closest to meeting, which may
be an indication of grand unification. The scale $\Lambda$ could then be connected to this
Grand Unification scale.
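
The arithmetic behind Eq. (4.55) is a one-line unit conversion, sketched here in
Python. The inputs $v = 246$ GeV, $m_\nu \simeq 0.1$ eV and $\lambda \sim 1$ are the values used in the
text; the function name is hypothetical.

```python
V_GEV = 246.0        # Higgs vev in GeV, as used in the text
EV_PER_GEV = 1.0e9   # unit conversion: 1 GeV = 10^9 eV


def cutoff_scale_gev(m_nu_ev, lam=1.0, v_gev=V_GEV):
    """Cutoff Lambda in GeV from m_nu = lambda v^2 / Lambda, Eq. (4.54)."""
    m_nu_gev = m_nu_ev / EV_PER_GEV
    return lam * v_gev**2 / m_nu_gev


if __name__ == "__main__":
    print(f"Lambda ~ {cutoff_scale_gev(0.1):.1e} GeV")
```

For a 0.1 eV neutrino mass this gives a cutoff of several times $10^{14}$ GeV, which is
the order-of-magnitude statement of Eq. (4.55).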

Another approach to neutrino masses is to assume that there exists a right-handed
neutrino $\nu_R$. After all, there is no strong reason to banish this state, especially since
there is an adequate right-handed partner state to all the other fermions. Furthermore,
if the above considerations are pointing to a grand unified theory, right-handed neutrinos
are generally present in acceptable versions, such as SO(10) where all the
fermions are in the 16 representation, including $\nu_R$. Quantum number considerations
indicate that $\nu_R$ is a pure singlet under the Standard Model gauge symmetries, and
thus we have a complication in the neutrino mass sector beyond what we encountered
for the other fermions of the theory. In particular, we are now able to add a Majorana
mass term $\nu_R^T i\sigma^2 \nu_R$ that is invariant all by itself without the need of a Higgs boson.
The full mass interactions available to the neutrino are now


$$\mathcal{L}_\nu = y_{ij}\,\bar{L}_i \Phi^c \nu_{jR} + \frac{M_{ij}}{2}\,\nu_{iR}^T i\sigma^2 \nu_{jR} + \text{c.c.} \tag{4.56}$$


The resulting $6 \times 6$ mass matrix in the $\{\nu_L, \nu_R\}$ basis is


$$m_\nu = \begin{pmatrix} 0 & m_D \\ m_D^T & M \end{pmatrix} \tag{4.57}$$


where $M$ is the matrix of Majorana masses with values $M_{ij}$ taken straight from
Eq. (4.56), and $m_D$ are the neutrino Dirac mass matrices taken from the Yukawa
interaction with the Higgs boson


$$(m_D)_{ij} = \frac{y_{ij}}{\sqrt{2}}\, v \tag{4.58}$$


Consistent with effective field theory ideas, there is no reason why the Majorana 
mass matrix entries should be tied to the weak scale. They should be of order the 
cutoff scale of when the Standard Model is no longer considered complete. Therefore, 
it is reasonable and expected to assume that $M_{ij}$ entries are generically much greater
than the weak scale. In that limit, the seesaw matrix of Eq. (4.57) has three heavy 
eigenvalues of O(M), and three light eigenvalues that, to leading order and good 
approximation, are eigenvalues of the 3 x 3 matrix 


$$m_\nu^{\rm light} = -m_D M^{-1} m_D^T \sim \frac{y^2 v^2}{M} \tag{4.59}$$

which is parametrically of the same form as Eq. (4.54). This is expected since the 
light eigenvalues can be evaluated from the operators left over after integrating out 
the heavy right-handed neutrinos in the effective theory. That operator is simply 
Eq. (4.53), where schematically $\Lambda$ can be associated with the scale $M$, and $\lambda$ can be
associated with $y^2$.
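
The seesaw behavior is easy to see numerically in a one-generation toy version of Eq. (4.57), where $m_D$ and $M$ are plain numbers rather than $3 \times 3$ matrices. The sketch below is an illustration under that simplifying assumption; the input values ($m_D = 174$ GeV, $M = 10^{15}$ GeV) and the function name are hypothetical choices made only for the example.

```python
import math
from typing import Tuple


def seesaw_eigenvalues(m_d: float, big_m: float) -> Tuple[float, float]:
    """Return (light, heavy) eigenvalues of the 2x2 matrix [[0, m_d], [m_d, big_m]]."""
    root = math.sqrt(big_m**2 + 4.0 * m_d**2)
    heavy = 0.5 * (big_m + root)
    # The determinant gives light * heavy = -m_d^2. Using it avoids the
    # cancellation in 0.5 * (big_m - root) when big_m >> m_d.
    light = -(m_d**2) / heavy
    return light, heavy


# Hypothetical inputs in GeV: a Dirac mass y*v/sqrt(2) with y ~ 1, and a heavy Majorana mass.
m_d = 174.0
big_m = 1.0e15

light, heavy = seesaw_eigenvalues(m_d, big_m)
leading_order = -(m_d**2) / big_m  # one-generation analogue of Eq. (4.59)

print(f"heavy eigenvalue : {heavy:.3e} GeV")
print(f"light eigenvalue : {light:.3e} GeV = {light * 1e9:.3f} eV")
print(f"-m_D^2 / M       : {leading_order:.3e} GeV")
```

The heavy eigenvalue sits at $M$, while the light one has magnitude of a few hundredths of an eV for these inputs. The overall minus sign has no physical consequence for the mass itself; only the magnitude matters here.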

We will emphasize in the next section that the story of neutrino masses conforms
very nicely with our notions of effective field theories. It is for this reason that most 
physicists are not terribly alarmed about the smallness of neutrino masses, even 
though on the surface it would appear quite disturbing to know that neutrinos are 
orders of magnitude in mass below other particles that we measure very directly at 
colliders. They are 12 orders of magnitude below the top quark mass, for example. 
Nevertheless, there is no concern. 

The role of effective theory becomes much more troublesome to understand in 
the context of Higgs boson physics, even though the Higgs boson mass is in the close 
neighborhood (i.e., less than an order of magnitude difference) of the W, Z, and top 
quark masses. We come now to the effective theory issues surrounding the peculiar
spin-zero Higgs boson, the main focus of this chapter and the topic we have been
building to.


4.5 Natural Effective Theories, the Higgs Boson, 
and the Hierarchy Problem 


The Standard Model with its postulated Higgs boson is an unsatisfactory theory for 
many reasons. There are several direct data-driven reasons why it is incomplete. The 
Standard Model has no explanation for the baryon asymmetry of the Universe. For 
some reason there are many more protons than anti-protons, and if the Universe is 
cooling from some primordial hot state with particles in thermal equilibrium, that is 
unexpected. Some mechanism that goes beyond the Standard Model dynamics must 
be at play. Similarly, there is plenty of astrophysical evidence for dark matter in the 
Universe. This dark matter helps to explain structure formation, details of the cosmic 
microwave background radiation, galactic rotation curves, etc. The problem is the 
Standard Model has no candidate explanation, and new physics must be invoked. 

There are many other reasons to consider physics beyond the Standard Model. The 
three gauge forces could be unified and the matter unified within representations of a 
grand unified symmetry. The many different parameters of the flavor sector are hard 
to swallow without envisaging deeper principles that organize them. Furthermore, 
the integration of the Standard Model with quantum gravity is not obvious, and many 
think a deeper structure, such as that built from strings and branes, is needed for their 
coexistence. 

So, there are many reasons to believe that there is physics beyond the Standard 
Model. But the issue that is front and center for us now, relevant to Higgs boson 
physics and electroweak explorations at the Large Hadron Collider, is the Hierarchy 
Problem. The Hierarchy Problem is often expressed as a question: Why is the weak 
scale ($\sim 10^2$ GeV) so much lighter than the Planck scale ($\sim 10^{18}$ GeV)? It is a bit
uninspiring when phrased this way, since it raises the question of why we should be
concerned at all about a big difference in scales. Blue whales are much bigger than
Nanoarchaeum equitans but we do not believe nature must reveal a dramatic new
concept for us to understand it (Clauset 2012). 

A knowing-just-enough-to-be-dangerous naive way to look at the Standard Model
is that it is the “Theory of Particles”, valid up to some out-of-reach scale where 
gravity might go strong, or some other violence is occurring that we do not care 
about. It is a renormalizable theory. I can compute everything at multiple quantum 
loop order, set counter terms, cancel infinities that are fake since they do not show 
up in observables, and then make predictions for observables that experiment agrees 
with. Quadratic divergences of the Higgs boson self-energy, which so many people 
make a fuss about, are not even there if I use dimensional regularization. The theory is 
happy, healthy, stable, and in no need of any fixes. New physics near the electroweak 
scale can still be justified (Wells 2003, 2005; Arkani-Hamed and Dimopoulos 2005; 
Giudice and Romanino 2004; Arkani-Hamed et al. 2005) after dismissing naturalness 
as impossibly imprecise to understand at this stage, but the urgency is certainly 
diminished for it being at the electroweak scale. 

This viewpoint that the Standard Model is complete can be challenged right at 
the outset. It is simply not the “Theory of Particles”; it does break down. It is an
effective theory, even if one thinks there is a way to argue it being valid to some
very remote high scale where gravity goes strong, such as $M_{\rm Pl}$. As an effective
theory, all operators should have their dimensionality set by the cutoff of the theory
(Polchinski 1992). If operator $\mathcal{O}^{(d)}$ has dimension $d$ then its coefficient is $c\,\Lambda^{4-d}$,
where $\Lambda$ is the cutoff of the theory and $c$ is expected to be $\sim 1$ in value. Irrelevant
operators with d > 4 cause no harm. Same goes for d = 4 marginal operators. The 
Standard Model is almost exclusively a theory of d = 4 marginal operators with 
its kinetic terms, gauge interaction terms, and Yukawa interaction terms. What is 
potentially problematic is the existence of any d < 4 relevant operators. In that case, 
the coefficients should be large, set by the cutoff of the theory. 
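
The power counting in this paragraph can be made concrete with a small sketch. It takes the rule that an operator of dimension $d$ carries a coefficient of natural size $c\,\Lambda^{4-d}$ and tabulates it for a few operators mentioned in this chapter. The cutoff value of $10^{15}$ GeV and the function names are assumptions chosen for illustration only.

```python
def classify_operator(dim: int) -> str:
    """Label an operator by its mass dimension d in four spacetime dimensions."""
    if dim < 4:
        return "relevant"
    if dim == 4:
        return "marginal"
    return "irrelevant"


def natural_coefficient(dim: int, cutoff_gev: float, c: float = 1.0) -> float:
    """Natural size c * Lambda^(4 - d) of the coefficient, in units of GeV^(4 - d)."""
    return c * cutoff_gev ** (4 - dim)


cutoff = 1.0e15  # hypothetical cutoff in GeV
examples = [
    ("|H|^2 mass operator", 2),
    ("nu_R Majorana mass", 3),
    ("Yukawa interaction", 4),
    ("d = 5 neutrino operator", 5),
]
for name, dim in examples:
    size = natural_coefficient(dim, cutoff)
    print(f"{name:24s} d = {dim}  {classify_operator(dim):10s}  ~ {size:.1e} GeV^{4 - dim}")
```

Only the two relevant operators end up with coefficients that grow with the cutoff, which is why they are the ones singled out for concern.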

Does the Standard Model have any gauge-invariant, Lorentz-invariant relevant
$d < 4$ operators to worry about? Yes, two of them: the right-handed neutrino Majorana
mass interaction term $\nu_R^T i\sigma^2 \nu_R$, which is $d = 3$, and the Higgs boson mass
operator $|H|^2$, which is $d = 2$. The expectation of effective field theories is that the
scale of the coefficients of these operators should be set by high-scale cutoffs of the 
theory and disconnected from any other surviving mass scale in the infrared. As we 
saw in Sect. 4.4 this expectation is nicely met in the neutrino case, where we have 
actually measured the masses and see a self-consistent picture for large Majorana 
masses for the right-handed neutrinos, which serve as cutoff scale coefficients. These 
coefficients are tied to lepton number violation, for example, and not electroweak 
symmetry breaking, and therefore have naturally large values above the weak scale. 

It did not have to be that way with neutrino physics. It could have been that the 
neutrino sector was shown experimentally to have independent left and right-handed 
components and the masses were of order the weak scale. This would have been in 
violation of effective field theory expectations, unless new symmetries tied to the 
weak scale were discovered to protect the right-handed neutrino from getting a large 
Majorana mass. The fact that the neutrino sector conforms with effective field theory 
expectations should be viewed as contributing evidence for these concepts. 

In contrast to the neutrino operator, the d = 2 Higgs mass operator in the Standard 
Model is unwelcome if its coefficient is not set to the weak scale. From our effective 
field theory expectations, the Lagrangian operator should be


$$\Delta\mathcal{L} = c\,\Lambda^2 |H|^2 \tag{4.60}$$


This is a potential disaster for the theory, since from our previous work on the Higgs 
potential we stated that the Higgs mass must be $-\mu^2 \sim v^2$, where $v \simeq 246$ GeV
is the Higgs boson vacuum expectation value needed to reproduce the W and Z
masses. If we assume the Standard Model to be a valid theory to very high energies
$E \gg v$, that implies the cutoff of the Standard Model effective theory is $\Lambda \gg v$, which
“incorrectly” implies the coefficient of $|H|^2$ is $|\mu|^2 = \Lambda^2 \gg v^2$. The effective theory
would then need the coefficient $c$ in Eq. (4.60) to be finetuned to an extraordinarily
small and unnatural (Giudice 2004) value $c \sim v^2/\Lambda^2$ to make all the scales work out
properly. The concern about how this can be so is the Hierarchy Problem. 
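
To get a feel for how severe the tuning is, the required coefficient $c \sim v^2/\Lambda^2$ can be evaluated for a few representative cutoffs. The sketch below is purely illustrative: the three cutoff values are the scales that appear in this chapter, and the dictionary labels and function name are my own.

```python
def required_coefficient(v_gev: float, cutoff_gev: float) -> float:
    """Coefficient c needed so that c * Lambda^2 is of order v^2."""
    return (v_gev / cutoff_gev) ** 2


V_GEV = 246.0
CUTOFFS_GEV = {
    "few TeV": 3.0e3,
    "10^15 GeV (Eq. 4.55)": 1.0e15,
    "10^18 GeV (Planck)": 1.0e18,
}

tuning = {label: required_coefficient(V_GEV, cutoff) for label, cutoff in CUTOFFS_GEV.items()}
for label, c in tuning.items():
    print(f"cutoff {label:22s} -> c ~ {c:.1e}")
```

For a cutoff of a few TeV the coefficient is within a few orders of magnitude of unity, whereas for a Planck-scale cutoff it must be smaller than $10^{-31}$.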

The discussion is a bit abstract, but it bears fruit with direct computations. As
one example out of an infinite number that would demonstrate the Hierarchy Problem,
consider the possible existence of other scalar fields $\phi_i$ at higher energies. The
assumption is that if there is a Higgs boson in the theory, then there is every reason to
believe that there can be other scalars. They can have mass at the weak scale, intermediate
scale, Planck scale, wherever. Let us suppose that we put one $\phi$ at the cutoff
scale $\Lambda$ of the theory. The operator $|\phi|^2|H|^2$ immediately gives a quantum correction
to the Higgs mass operator coefficient of $\sim \Lambda^2/16\pi^2$. Although the $1/16\pi^2$ can
help a little, if $\Lambda \gg 4\pi v$ there is a serious problem, and the weak scale cannot exist
naturally with such a hierarchy. For this reason, it is often assumed that naturalness of
the Higgs boson sector of the Standard Model effective theory requires new physics
to show up at some scale below $\Lambda \sim 4\pi v \sim$ few TeV.
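
The numbers behind this estimate are simple to check. The following sketch evaluates the schematic correction $\Lambda^2/16\pi^2$ relative to $v^2$ for a few cutoffs. It assumes an order-one coupling for the $|\phi|^2|H|^2$ operator (the `coupling` parameter is a hypothetical knob, not something specified in the text) and ignores all other loop factors.

```python
import math

V_GEV = 246.0


def higgs_mass_correction_gev2(cutoff_gev: float, coupling: float = 1.0) -> float:
    """Schematic shift of the |H|^2 coefficient: coupling * Lambda^2 / (16 pi^2), in GeV^2."""
    return coupling * cutoff_gev**2 / (16.0 * math.pi**2)


# Scale at which the correction equals v^2 for an order-one coupling.
naturalness_scale_gev = 4.0 * math.pi * V_GEV
print(f"4 pi v = {naturalness_scale_gev:.0f} GeV")

for cutoff in (naturalness_scale_gev, 1.0e4, 1.0e15):
    ratio = higgs_mass_correction_gev2(cutoff) / V_GEV**2
    print(f"Lambda = {cutoff:.1e} GeV -> correction / v^2 ~ {ratio:.1e}")
```

The scale $4\pi v$ comes out at about 3 TeV, consistent with the "few TeV" quoted above, and the ratio grows quadratically with the cutoff beyond that point.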

There are many different approaches to solving the Hierarchy Problem. One 
approach suggests that there is new physics at the TeV scale and the cutoff $\Lambda$ in
Eq. (4.60) is in the neighborhood of the weak scale. Supersymmetry (Martin 1997), 
little Higgs (Schmaltz and Smith 2005), conformal theories (Frampton and Vafa 
1999), and extra dimensions (Sundrum 2005; Rattazzi 2006) can be employed in 
this approach. For example, supersymmetry accomplishes the task by a softly broken
symmetry, where $\Lambda$ is the supersymmetry breaking mass scale. All quadratic
divergences to the Higgs boson mass operator cancel up to supersymmetry breaking
terms. Extra dimensions accomplish it by banishing all mass scales accessible to
the Higgs boson above the TeV scale. Another approach banishes from the theory
any fundamental scalars that could form invariant $|\phi|^2$ operators. For
example, this is the approach of Technicolor (Lane and Martin 2009) and top-quark 
condensate theories (Hill 1991, 1995; Martin 1997; Chivukula et al. 1999) that try to 
reproduce the symmetry breaking of a Higgs boson with the condensate of a fermion 
bilinear operator. Higgsless theories and their variants are also in this category (Csaki 
et al. 2004a,b; Cui et al. 2009). These theories are obviously less interesting given 
the discovery of a Higgs-like boson, but it is extraordinarily difficult, and perhaps 
impossible, to resolve whether the Higgs boson is a fundamental scalar or merely 
a composite particle acting like a scalar. Also, theories with no true Higgs boson
can have another particle (a dilaton, for example) that acts like a Higgs boson.
Therefore, these theories still have life within them, and more data is required to
gain confidence in these alternative explanations or rule them out. 

Nevertheless, the least complicated thoughts suggest to us that a simple Higgs 
boson has been discovered with mass of approximately 126 GeV (Aad et al. 2012; 
Chatrchyan et al. 2012). Of course, there is no certainty that it is the SM Higgs 
boson. Indeed, such certainty is likely to never exist, but measurements at the LHC 
can likely give us confidence that its couplings are within 20% of the values that 
the SM Higgs boson would have. Next-generation colliders, such as an $e^+e^-$ linear
collider, would be able to further refine this to percent level, or perhaps even show 
that there are small deviations from SM expectations. In any case, it is legitimate 
to call it “a Higgs boson” since it appears to be coupling to the vector boson and 
fermions according to their mass values, and that puts an added confidence that the 
particle is associated with mass generation. Again, metaphysical certainty into the 
nature of any particle will always be out of the question, but the evidence is accruing 
and the words “for all practical purposes” are just around the corner. 

This has been a major achievement by humans. The historical theory development 
that culminated in a highly speculative prediction for a new Higgs boson that turned 
out to be there is just one aspect of this achievement. There are also the decades of work
and expertise built up to invent and apply experimental techniques that discovered 
the boson. This is not to mention the impressive human resource management skills 
needed to herd all the people together in a collective effort to divide tasks and construct 
the coherent whole—the discovery. 

The smugness we may feel for the discovery of the Higgs boson is to be tempered
with the stark truth that nothing else has been found at the LHC at this time. If it
continues this way, it means that many predictions influenced by concepts of effective
theories were wrong: those that insisted that the Higgs boson needed an entourage of other
particles very close by in mass to tame its quantum instabilities. Maybe they were
only wrong quantitatively, and new particles and dynamics are around the corner to
vindicate effective theories.

Or perhaps there is yet another factor that is overriding our effective theory intu- 
itions. Perhaps there is a multiverse where the solution to the Hierarchy Problem 
suggests that large statistics of finetuned solutions dominate over the smaller number
of non-tuned solutions in the landscape, leading to a higher probability of our
Universe landing in a highly tuned solution ($c \ll 1$). Thus, guided by concerns over
the cosmological constant problem, it has been suggested that this statistical, stringy
naturalness over the landscape may take precedence over normal naturalness envisioned
from effective field theories (Douglas 2007; Kumar 2006). Although not
directly related to external particle physics interactions, the cosmological constant
can be considered as the coefficient of yet another gauge-invariant, Lorentz-invariant
operator, the operator being merely a constant: $-\mathcal{L}_{cc} = \Lambda_{cc}^4$. The tiny value of this
coefficient, $\Lambda_{cc}^4 \sim (10^{-3}\ \mathrm{eV})^4$, is well below any conceivable theory expectation. It
is the elephant in the room for effective field theories. However, it is an unexpressed 
article of faith among most particle physicists that the solution to the Cosmological 
Constant Problem lies in the details of mysterious quantum gravity, and that the 
new concepts buried in that unknown solution do not materially affect the natural 
solution to the Hierarchy Problem. Landscapists question that assumption. This is
controversial with conflicting claims over unrealistic theories; nevertheless, it is an 
interesting idea that might one day be impactful. 

Data keeps coming, and searches for new particles that would vindicate our most
basic notions of effective field theories and naturalness continue. Many “good ideas” are now
dead after years of data have found no evidence for them. There is no theorem that
we will have full resolution to all the “good ideas” within our lifetimes, or that any of 
the colliders we are running or contemplating in the future will have enough energy 
or luminosity or precision to give a final say on the matter. Nevertheless, the field 
carries on and the tree of various interpretations of what has been seen and what has 
not been seen grows branches, flowers, and surely will bear fruit again. 


Chapter 5
Effective Theories and Theory Choice 


Abstract Promoting a theory with a finite number of terms into an effective 
field theory with an infinite number of terms worsens simplicity, predictability, 
falsifiability, and other attributes often favored in theory choice. However, the impor- 
tance of these attributes pales in comparison with consistency, both observational and 
mathematical consistency, which propels the effective theory to be superior to its sim- 
pler truncated version of finite terms, whether that theory be renormalizable (e.g., 
Standard Model of particle physics) or unrenormalizable (e.g., gravity). Some impli- 
cations for the Large Hadron Collider and beyond are discussed, including comments 
on how directly acknowledging the preeminence of consistency can affect future the- 
ory work. 


5.1 Introduction 


One of the most interesting questions in philosophy of science is how to determine 
the quality of a theory. Given the data, how can we infer a “best explanation” for the
data? This often goes by the name “Inference to Best Explanation” (IBE) (Harman
1965; Lipton 1991; Clayton 1997). The wide variety of claims for important criteria
is a measure of how difficult it is to come up with a clear and general algorithm
for choosing between theories. Some claim even that it is intrinsically not possible 
to come up with a methodology of deciding (Lehrer 1974; Newton-Smith 1981). 
Nevertheless the goals of IBE are worthy, and the payoff is high upon increased 
understanding, if for no other reason than the extraphilosophical importance of distributing
grant money more fairly to researchers. Furthermore, whether or not objective
criteria for IBE are possible, all practitioners of science have no choice but to engage
in the “infer” part even if they may never touch upon the “best explanation” part of 
IBE. 

The goal of this chapter is to survey theory choice criteria in the context of effective 
theories. It has been accepted by the physics communities that theories must be 
“effectified”, that is, they must be augmented to include all possible interactions
consistent with the stated symmetries to all orders. On the surface the resulting 
effective theories are in conflict with the rules of IBE, whether they be the murky 
rules that some physicists put forward when they talk about theory choice, or the 
precisely stated rules developed by philosophers. Upon closer inspection effective 
theories rise quickly to the top in theory choice when admitting to the primacy of 
consistency in theory choice. That is the claim, to be developed below. The reader 
should be warned that I will use the acronym IBE to mean any attitude, theory, system 
by which people decide that one theory is a better description of nature than another, 
or that a theory under consideration is a good theory at all. 


5.2 The Standard Model’s Triumphs and Woes 


This chapter is primarily written from the science perspective of elementary particle 
theory, with particular emphasis on the subfield “beyond the Standard Model
physics”. In this subfield, the task is to look out over nature and ask what is not
adequately described by the Standard Model (SM) of particle physics. The Standard
Model has been with us for about 40 years. It consists of three families of up-type
quarks (u, c, t), three families of down-type quarks (d, s, b), three families of leptons
($e$, $\mu$, $\tau$) and three families of neutrinos ($\nu_e$, $\nu_\mu$, $\nu_\tau$). These interact with each other
according to gauge field theory interactions, mediated by the force carrier bosons 
of the photon, gluons and W and Z bosons. Every particle that has mass is said 
to achieve it by a condensing Higgs boson. For a more complete non-technical or 
technical description of the SM see references Kane (1996) and Griffiths (2008), 
respectively. 

The SM is a renormalizable theory and can be fully described on one page using 
standard nomenclature of mathematics and relativistic quantum field theory. Despite 
that simplicity, it can account for every measurement ever made at high-energy col- 
liders. It is an enormous human achievement. So why are there researchers searching 
for theories “beyond the SM”? There are many reasons, of which I will name a few: 


There are non-collider observations we still cannot explain such as galactic rotation 
measurements that imply the existence of dark matter, and the preponderance of 
baryons over anti-baryons in the universe. 

The particle content and the three gauge forces cry out for unification (e.g., grand 
unified theories). 

There are many parameters with large hierarchies that beg for explanation
($m_t/m_e > 10^5$).

The SM Higgs boson appears unstable to quantum corrections and is thus 
unnatural. 

Surely there is more than just the SM (e.g., SM is just copies of stuff in our bodies).

Embedding gravity into quantum mechanics is a severe challenge and should bring
new implications to the particle physics world (e.g., string theory).

Thus, there are many opportunities to devise new theories that solve one or more
of these problems. The theories are necessarily speculative upon birth. They are put 
to the test, and the simple fact is that at any given moment there are a multitude of 
theories that appear to be able to solve one or more of the issues. We have many 
variants of supersymmetric theories, strongly coupled theories, extra dimensional 
theories, etc. that appear to be able to do the job and are not yet distinguishable by 
currently known data. How does a scientist determine which is the best? The rules
are not clear, of course, and we shall first ask how scientists make theory choices,
and when IBE enters their calculus.


5.3 Theory Choice Among Practitioners 


Typically a particle physicist will look at the SM problems listed above and set out 
to construct a new theory that explains one or more of them. The particle researcher 
often stumbles into a theory choice of what to work on not based on IBE but rather 
DBO (deduction of best opportunities). The opportunities that arise may include 
matching yourself with the best PhD advisor who is working on theory X, research- 
ing a fashionable topic to get a good job, supporting a clever theory that the researcher 
devised that might not have high probability of being correct but has highest prob- 
ability of enormous personal pay-off, etc. The last reason then circles back on the 
first reason as advisors ask their students to work on theories that they themselves 
devised. Furthermore, the subtleties of elementary particle physics and beyond the 
SM theories are such that it could take new practitioners years before they feel con- 
fident that they could make a reliable IBE estimate even if the criteria for such were 
clear to them. Thus, IBE considerations are often not the dominant force for their 
theory choice (i.e., what to work on) in a practicing scientist’s career. 

IBE issues do arise when there is competition among researchers for journal 
space, research funds, and conference time slots. IBE-like arguments ensue. Words 
used to describe the evaluation of theories are familiar to philosophers: simplicity, 
economy, calculability, compatibility with data, testability, falsifiability, naturalness, 
finetuning, predictivity, unification, no ad hoc assumptions, etc. Researchers become 
attorneys for their theories and give the most weight to the IBE criteria that most favorably
support the direction of their research lines. That is why experimentalists
and phenomenologists emphasize “falsifiability” and “observational consistency”
much more than string theorists, who emphasize “unification”, “completeness” and
“mathematical consistency”. 

It is often said at the end of arguments between theorists about their pet theories 
that “experiment will decide”. However, as experiments become larger and costs 
grow steeply, the time frame may extend well past decades to even centuries. It took 
more than 25 years for CERN to conceive and build the LHC, for example. There is no 
guarantee that any timeline convenient to a human is relevant to future experimental 
construction. However, what is relevant to human time scales is deciding what are 
“better” or “best theories”, since we should use that to allocate resources of time,
money, etc. Working toward perfecting IBE criteria, no matter how controversial
they are, is clearly warranted. 

The recent universal acclaim of effective theories gives us an opportunity to apply 
IBE thinking to a case that is not controversial as a means to better understand the 
weight that should be given to various elements of IBE. In the next section, I will 
describe how the effective SM is different from the SM, and then we shall survey 
their IBE qualities, with an eye toward gaining insight along the way. 


5.4 The Standard Model Versus the Effective 
Standard Model 


The SM has a finite number of operators of dimension four or less. The effec- 
tive SM (ESM) is the SM but with all possible higher dimensional operators 
present consistent with the sacrosanct symmetries of the SM: Lorentz symmetry and 
$SU(3)_C \times SU(2)_L \times U(1)_Y$ gauge symmetries. Thus we can relate the lagrangians
of the two by the equation


$$\mathcal{L}_{\rm ESM} = \mathcal{L}_{\rm SM} + \sum_{n,i} \frac{\eta_{n,i}}{\Lambda^n}\, \mathcal{O}_i^{(4+n)} \tag{5.1}$$

where $\mathcal{O}_i^{(4+n)}$ is the collection of all operators of higher dimension $4 + n$ that respect
the symmetries of the SM and have unknown couplings $\eta_{n,i}/\Lambda^n$ in front.

The SM matches all observed high-energy collider data to excellent compatibility. 
There can be no additional operators that would improve the fit by a meaningful 
amount. Furthermore, if any of the higher-dimensional operators of $\mathcal{L}_{\rm ESM}$ become
worrisome with respect to the data, we need merely tune down the strength of the
interaction by making its associated $\eta_{n,i}$ coupling smaller, to escape the problem.

Which theory is better, the ESM or SM, given that they both can be made equally 
compatible with the data? To answer this question, let us first apply some of the 
IBE thinking common in the particle physics community. Our example source for a 
typical particle physicist approach to these issues will be the essay written by Nobel 
Laureate Burton Richter (2006). We shall also attempt to answer the question using 
the criteria of the philosopher Paul Thagard (1978), whose paper is still considered 
one of the key early expositions on theory choice criteria for IBE. 


5.5 Richter’s IBE Criteria 


There are not many official forums through which practicing particle physicists are 
encouraged to divulge their IBE criteria. But one forum where it regularly happens, 
both in essays and in letters to the editor, is in professional society monthly notices. 
One of the most talked about articles of this kind in recent years was written by
Burton Richter (2006). Richter and Sam Ting won the Nobel Prize in Physics in
1976 for finding the $J/\psi$ particle, which was a key discovery in establishing the SM.

Richter has been horrified by what he views as “major problems in the philosophy
behind theory” research. He says, 


Simply put, most of what currently passes as the most advanced theory looks to be more 
theological speculation, the development of models with no testable consequences, than it 
is the development of practical knowledge, the development of models with testable and 
falsifiable consequences (Karl Popper’s definition of science). 


Richter goes on to say that more weight should be put on 


1. theories that have testable and falsifiable consequences, and 
2. theories that simplify rather than increase complication. 


Incidentally, he also discusses two anti-criteria that should not be used, which are the 
anthropic principle and naturalness. Let us not discuss these anti-criteria, but rather 
judge the SM versus ESM based on what Richter would have us do, on falsifiability 
and simplicity. 

Regarding falsifiability and testable consequences, an argument can be made that 
the SM wins. The ESM has an infinite number of operators with coefficients to 
be pinned down by data later, and as such can accommodate more experimental 
outcomes than the SM. After all, the ESM reduces to the SM when $\Lambda \to \infty$. Thus, the SM
is much more testable and falsifiable than the ESM.

Although not central to the subsequent discussion, I would like to remark that falsifiability
has never struck me as a strong argument for deciding between theories.
Skepticism toward falsifiability has long been held in the philosophy community, but
let me give what I think are three strong reasons to worry about its applicability. Let’s
take an example of an unfalsifiable theory: Theory X says that obvious fact Y is true
(e.g., emeralds are green, or something trivially true like that), and that angels live
in another universe. We can use this silly theory to illustrate why falsifiability is not
a very solid criterion.

First, the modularity of the theory can be under dispute, such as the more testable 
first statement versus the second statement. Second, if things change dramatically 
such that what was true yesterday is not true tomorrow (tomorrow Y is false), then the 
theory is trivially invalidated. Does that make it falsifiable? In that case, all theories
can be made falsifiable by scattering the word “always” throughout their descriptions. And third,
it is never clear if falsifiability must be applicable in principle or in practice. In 
principle perhaps everything is falsifiable (e.g., many versions of string theory—just 
run a collider at $10^{19}$ GeV), whereas in practice good theories might not be (perhaps:
string theory, high scale warped extra dimensions, etc.) because of lack of money, 
technology, time, or manpower to test it. 

In short, you can like it, you can hope for it, you can wish for it, you can say it 
would make our lives as scientists much easier if so, but it would be presumptuous of us
to say that Nature cares one whit if we can falsify a true statement. Nevertheless, we 
shall take it seriously because we are investigating somebody else’s criteria, which 
happen to be shared by many others. And as we have noted above, the testability and
falsifiability criteria favor the SM over the ESM.

Regarding simplicity, the SM has a finite number of terms with a finite number 
of coefficients, whereas the ESM has an infinite number of terms with an infinite 
number of coefficients. No contest, SM wins in the simplicity category. 

According to Richter’s two key criteria, falsifiability and simplicity, the SM is the 
winner, and we infer it to be the best theory. 


5.6 Thagard’s IBE Criteria 


The philosophy literature is vast on the subject. Surveying it with sweeping scope 
would not be enlightening and picking just one approach to compare leaves one 
wanting. Nevertheless, I will do the latter, choosing a classic paper on the subject by 
Paul Thagard (1978). 

Thagard’s theory choice criteria are 


1. Consilience: The measure of how many facts the theory explains; furthermore, 
“a consilient theory unifies and systematizes.” 

2. Simplicity: The quality of having the fewest “auxiliary hypotheses,” fewest ad
hoc additions, and most ontological economy. 

3. Analogy: Shared characteristics between two theories lead to one theory admitting
a new characteristic if the new characteristic is part of the other theory and
explains the shared characteristics there.


Regarding consilience, it is a draw between SM and ESM. The facts are equally 
compatible in the two theories, and there is no relevant advantage in either in the 
realm of unification and systematizing. 

Regarding simplicity, although it is not exactly the kind of simplicity that Richter 
was talking about, the SM clearly is superior to the ESM in this category. The new 
operators of the ESM simply add more. 

Regarding analogy, it is my view that the ESM wins out over the SM. I will 
explain why twice. First, I will explain it here strictly in the language of Thagard’s 
analogy propositions. I will explain it a second time later heuristically using particle 
physics language from Steven Weinberg. 

As Thagard explains, by analogy he does not mean the standard syllogism 

A is P, Q, R, S
B is P, Q, R
Thus, B is S.

No, he means something more causally connected:

A is P, Q, R, S
B is P, Q, R
If S explains P, Q, R in A, then B is S.

I will follow this analogy criterion by first defining

A = chiral lagrangian of pion scattering
B = Effective Standard Model (ESM).
The chiral lagrangian of pion scattering is 


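$$\mathcal{L}_\chi = \frac{v^2}{4}\,\mathrm{Tr}\left(\partial_\mu U\,\partial^\mu U^\dagger\right) + \frac{\ell_1}{4}\left[\mathrm{Tr}\left(\partial_\mu U\,\partial^\mu U^\dagger\right)\right]^2 + \cdots \qquad (5.2)$$


where

$$U = \exp(i\,\tau\cdot\pi/v). \qquad (5.3)$$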


This lagrangian has an infinite number of terms respecting the underlying SU(2)
custodial symmetries. The pions are the π fields in U, and v is the vacuum expectation
value breaking the custodial symmetry SU(2)_L × SU(2)_R to its SU(2)_V vector
subgroup. All the interactions of the pions are contained within these terms. As the
energy increases the higher order terms in the lagrangian become more important, 
and the data can be accommodated. This theory was very useful. It was determined 
that all the higher order corrections needed to be there, although a deep appreciation 
of why was not to come until Ken Wilson’s renormalization breakthroughs years 
later. 
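
As a hedged illustration of how the pion interactions sit inside U, one can expand the leading term of the chiral lagrangian in powers of π/v. The sketch below assumes the standard normalization v²/4 for the leading term and the Pauli-matrix trace Tr(τ^a τ^b) = 2δ^{ab}; these conventions are assumptions of the sketch and are not spelled out in the text.

```latex
\begin{align}
U &= \exp(i\,\tau\cdot\pi/v)
   = 1 + \frac{i}{v}\,\tau^a\pi^a + \mathcal{O}(\pi^2/v^2), \\
\partial_\mu U &= \frac{i}{v}\,\tau^a\,\partial_\mu\pi^a
   + \mathcal{O}(\pi\,\partial\pi/v^2), \\
\frac{v^2}{4}\,\mathrm{Tr}\!\left(\partial_\mu U\,\partial^\mu U^\dagger\right)
  &= \frac{1}{4}\,\mathrm{Tr}(\tau^a\tau^b)\,
     \partial_\mu\pi^a\,\partial^\mu\pi^b
     + \mathcal{O}\!\left(\pi^2(\partial\pi)^2/v^2\right) \\
  &= \frac{1}{2}\,\partial_\mu\pi^a\,\partial^\mu\pi^a
     + \mathcal{O}\!\left(\pi^2(\partial\pi)^2/v^2\right).
\end{align}
```

The first term is the canonical kinetic term for the three pions. Every further term in the expansion is an interaction suppressed by powers of 1/v, with a strength fixed by the symmetry rather than chosen by hand.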

Now, the shared properties P, Q, R of the chiral lagrangian theory of pion scat- 
tering and the ESM are 


P, Q, R = quantum field theory, perturbative expansion theory, all lowest dimensionality 
terms allowed by symmetries of the theory are present, finite number of terms relevant in 
deep infrared, etc. 


and the new characteristic S in theory A is 


S = all operators consistent with the symmetries are present. 


S explains P, Q, R because relevant terms are a subset of “all operators”. 

Thus, by Thagard’s analogy we would say that the SM should be augmented by
all possible terms consistent with its symmetries, yielding the ESM. The argument is further
strengthened later when we catch Weinberg directly using the language of analogy 
to support the generalization of effective field theory techniques to the SM. 
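
The difference between the standard syllogism and Thagard's causally connected version can be made concrete with a small sketch. The class `Theory`, the functions `naive_analogy` and `thagard_analogy`, and the short property labels are all hypothetical names introduced here for illustration; they are not Thagard's formalism, only one possible encoding of the two inference patterns described above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Theory:
    """Hypothetical container: a theory is a name plus a set of properties."""
    name: str
    properties: set[str]
    # Maps an explanatory property to the properties it explains in this theory.
    explains: dict[str, set[str]] = field(default_factory=dict)


def naive_analogy(a: Theory, b: Theory, candidate: str) -> bool:
    """Standard syllogism: A has the candidate and A, B share some properties."""
    shared = (a.properties & b.properties) - {candidate}
    return candidate in a.properties and bool(shared)


def thagard_analogy(a: Theory, b: Theory, candidate: str) -> bool:
    """Causally connected version: the candidate must also explain,
    within A, every property that A and B share."""
    shared = (a.properties & b.properties) - {candidate}
    if candidate not in a.properties or not shared:
        return False
    return shared <= a.explains.get(candidate, set())


# Illustrative labels for P, Q, R and S as listed in the text.
P, Q, R = "quantum field theory", "perturbative expansion", "lowest-dimension terms present"
S = "all operators consistent with the symmetries are present"

chiral = Theory("chiral lagrangian", {P, Q, R, S}, explains={S: {P, Q, R}})
esm = Theory("ESM", {P, Q, R})

if __name__ == "__main__":
    print("naive syllogism supports S for B:", naive_analogy(chiral, esm, S))
    print("Thagard analogy supports S for B:", thagard_analogy(chiral, esm, S))
```

The design choice worth noting is that the stronger rule needs the extra `explains` relation: if S were present in A but explained none of the shared properties, the naive syllogism would still fire while Thagard's version would not.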

The result of our analysis based on Thagard’s IBE criteria is SM +1, ESM +1, 
and Draw +1. No clear resolution to be found here. 


5.7 Non-negotiable Attributes of a Best Explanation 


What is lacking in our discussion of IBE criteria is a rank ordering of attributes. 
We must first ask ourselves what is non-negotiable. Falsifiability is clearly something
that can be haggled over. Simplicity is subject to definitional uncertainty, and
furthermore has no universally accepted claim to preeminence. Naturalness, calculability,
unifying ability, predictivity, etc. are also subject to preeminence doubts.

What is non-negotiable is consistency. A theory shown definitively to be incon- 
sistent does not live another day. It might have its utility, such as Newton’s theory of 
gravity for crude approximate calculations, but nobody would ever say it is a better 
theory than Einstein’s theory of General Relativity.¹

Consistency has two key parts to it. The first is that what can and has been computed 
must be consistent with all known observational facts. As Murray Gell-Mann said 
about his early graduate student years, “Suddenly, I understood the main function
of the theoretician: not to impress the professors in the front row but to agree with
observation” (Gell-Mann 1994). Experimentalists of course would not disagree with
this non-negotiable requirement of observational consistency. “If you cannot match
the data, what are you doing?” they would say.

However, theorists have a more nuanced approach to establishing observational 
consistency. They often do not spend the time to investigate all the consequences 
of their theories. Others do not want to “mop up” someone else’s theory, so they 
are not going to investigate it either. We often get into a situation in which a newly
proposed theory solves one problem but looks like it might create dozens of
other incompatibilities with the data, and nobody wants to be bothered to compute them.
Furthermore, the implications might be extremely difficult to compute. 

Sometimes there must be suspended judgment in the competition between excel- 
lent theories and observational consequences. Lord Kelvin claimed Darwin’s evolu- 
tion ideas could not be right because the sun could not burn long enough to enable 
long-term evolution over millions of years that Darwin knew was needed. Darwin 
rightly ignored such arguments, deciding to stay on the side of geologists who said 
the earth appeared to be millions of years old (Gavin et al. 2008). Of course we know 
now that Kelvin made a bad inference because he did not know about the fusion 
source of burning within the sun that could sustain its heat output for billions of 
years. 

A second part to consistency is mathematical consistency. There are numerous 
examples in the literature of subtle mathematical consistency issues that need to be 
understood in a theory. Massive gauge theories looked inconsistent for years until 
the Higgs mechanism was understood. Some gauge theories you can dream up are 
“anomalous” and inconsistent. Some forms of string theory are inconsistent unless 
there are extra spatial dimensions. Extra time dimensions appear to violate causality, 
even when one tries to demand it from the outset, thereby rendering the theory 
inconsistent. Theories with ghosts, which may not be obvious upon first inspection, 
give negative probabilities of scattering. 

Mathematical consistency is subtle and hard at times, and like observational
consistency there is no theorem that says that it can be established to comfortable
levels by theorists on time scales convenient to humans. Sometimes the inconsistency
is too subtle for the scientists to see right off. Other times the calculability of the
mathematical consistency question is too difficult to give a definitive answer and it
is a “coin flip” whether the theory is ultimately consistent or not. For example,
pseudomoduli potentials that could cause a runaway problem are incalculable in some
interesting dynamically broken supersymmetric theories (Intriligator et al. 2009).


¹ The word “better” in this context can induce apoplectic shocks in pedants. To avoid that, by
“better” I wish to say that it is closer to the true, underlying theory, whatever that may mean or be.
I do not wish it to mean “better to calculate a hammer fall on the moon in under three lines for
primary school children”, or any other similar appeal to convenience or simplicity.

It is not controversial that observational consistency and mathematical consis- 
tency are non-negotiable; however, the due diligence given to them in theory choice 
is often lacking. The establishment of observational consistency or mathematical 
consistency can remain in an embryonic state for years while research dollars flow 
and other IBE criteria become more motivational factors in research and inquiry, and 
the consistency issues become taken for granted. 

This is one of the themes of Gerard ’t Hooft’s essay “Can there be physics without
experiments?” (Hooft 2001). He reminds the reader that some of the grandest theories
are investigations of the nature of spacetime at the Planck scale, which is many orders 
of magnitude beyond where we currently have direct experimental probes. If this is to 
continue as a physics enterprise it “may imply that we should insist on much higher 
demands of logical and mathematical rigour than before.” Despite the weakness of 
verb tense employed, it is an incontestable point. It is in these Planckian theories, 
such as string theory and loop quantum gravity, where the lack of consistency rigor 
is so plainly unacceptable. However, the cancer of lax attention to consistency can 
spread fast in an environment where theories and theorists are fêted before vetted.


5.8 Effective Field Theories and Consistency 


Let us begin with the claim at the heart of our discussion. The claim behind the 
ascendancy of effective theories is that unless there is good and explicit reason 
otherwise, consistency requires that a theory have all possible interactions consistent 
with its symmetries at every order. 

The claim has its origins in the work of Wilson, whose original review article 
with Kogut (Wilson and Kogut 1974) is a classic. There are many modern reviews 
of effective theories that make or assume the above claim (Polchinski 1992; Cohen 
1993; Rothstein 2003). Weinberg’s recent historical perspective (Weinberg 2009)
gives an excellent summary of what was learned: 


I was struck [at Erice school in 1976] by Kenneth Wilson’s device of “integrating out” 
short-distance degrees of freedom by introducing a variable ultraviolet cutoff, with the bare 
couplings given a cutoff dependence that guaranteed that physical quantities are cutoff inde- 
pendent. Even if the underlying theory is renormalizable, once a finite cutoff is introduced 
it becomes necessary to introduce every possible interaction, renormalizable or not, to keep 
physics strictly cutoff independent.... Indeed, I realized that even without a cutoff, as long 
as every term allowed by symmetries is included in the Lagrangian, there will always be
a counterterm available to absorb every possible ultraviolet divergence...


Therefore, consistency of the theory (the absorption of ultraviolet divergences, the
maintaining of independence of arbitrary ultraviolet scale cutoff, etc.) requires the
introduction of all possible terms allowed by the symmetries.

The issue of consistency then becomes front and center, and the issues of simplicity 
and testability fade in importance. From our discussion above we know that without 
this important issue of consistency, the effective SM may not win in a theory choice 
competition compared to the SM with just its renormalizable operators, since it 
worsens the otherwise positive features of simplicity and testability. Therefore, the
establishment of rigorous consistency requirements on the theory was crucial in the
decision.
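
The content of the effective SM can be written schematically as follows. The notation is an assumption of this sketch rather than the text's: a single heavy scale M, dimensionless coefficients c_i^{(d)}, and operators O_i^{(d)} of mass dimension d built from SM fields and respecting the SM symmetries.

```latex
\begin{equation}
\mathcal{L}_{\rm ESM} = \mathcal{L}_{\rm SM}
  + \sum_{d>4}\,\sum_i \frac{c_i^{(d)}}{M^{\,d-4}}\,\mathcal{O}_i^{(d)},
\qquad
\frac{\delta\mathcal{A}}{\mathcal{A}_{\rm SM}}
  \sim c_i^{(d)}\left(\frac{E}{M}\right)^{d-4}.
\end{equation}
```

The second relation is only the naive power-counting estimate for the relative shift of an amplitude at energy E well below M. It illustrates the trade-off discussed above: the infinite tower of terms costs simplicity and testability, yet each term is strongly suppressed at low energy.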


5.9 Relation to Thagard’s Analogy Criterion 


I would like to take a quick aside and show that physicists do reason in real-life, 
complex theory circumstances through the analogy criterion of Thagard. Indeed, it 
is a separate argument for the general applicability of effective theories. 

In the same historical review article (Weinberg 2009) quoted above, Weinberg
shows that because effective field theory ideas were necessary in chiral dynamics 
(low-energy pion scattering), the concept should also apply to the SM. Here is a 
relevant quote: 


Perhaps the most important lesson from chiral dynamics was that we should keep an open 
mind about renormalizability. The renormalizable Standard Model of elementary particles 
may itself be just the first term in an effective field theory that contains every possible 
interaction allowed by Lorentz invariance and the SU(3) × SU(2) × U(1) gauge symmetry,
only with the non-renormalizable terms suppressed by negative powers of some very large
mass M, just as the terms in chiral dynamics with more derivatives ... are suppressed by
negative powers of 2πF_π ≈ m_N.


One should note the usage of analogy language: “most important lesson from chiral
dynamics” and “just as the terms in chiral dynamics”. Thus, the syllogistic representations
given in Sect. 5.6 are shown to apply and be part of theory choice for
particle physicists.


5.10 Summary: The Preeminence of Consistency 


I will conclude by stating my two central points that generalize the discussion we 
have had above in comparing the effective SM with the SM. 

My first point is that the conditions of theory choice should be ordered. Frequently 
we see the listing of criteria for theory choice given in a flat manner, where one is 
not given precedence over the other a priori. We see consilience, simplicity, falsifia- 
bility, naturalness, consistency, economy, all together in an unordered list of factors 
when judging a theory. However, consistency must take precedence over any other
factors. Observational consistency is obviously central to everyone, most especially
our experimental colleagues, when judging the relevance of theory for describing
nature. Despite some subtleties that can be present with regard to observational
consistency,² it is a criterion that all would say is at the top of the list.

Mathematical consistency, on the other hand, is not as fully appreciated. In 
Richter’s essay excoriating theorists he did not appear to recognize or acknowl- 
edge the central role that mathematical consistency plays in developing and vetting 
theories. Mathematical consistency has a preeminent role right up there with obser- 
vational consistency, and can be just as subtle, time-consuming and difficult to estab- 
lish. We have seen that in the case of effective theories it trumps other theory choice 
considerations such as simplicity, predictivity, testability, etc.

My second point builds on the first. Since consistency is preeminent, it must have 
highest priority of establishment compared to other conditions. Deep, thoughtful 
reflection and work to establish the underlying self-consistency of a theory takes 
precedence over finding ways to make it more natural or to have fewer parameters (i.e.,
simple). Highest priority must equally go into understanding all of its observational 
implications. A theory should not be able to get away with being fuzzy on either of 
these two counts, before the higher order issues of simplicity and naturalness and 
economy take center stage. That this effort might take considerable time and effort 
should not be correlated with a theory’s value, just as it is not a theory’s fault if it 
takes humans decades to build a collider to sufficiently high energy and luminosity 
to test it. 

Additionally, dedicated effort on mathematical consistency of the theory, or class 
of theories, can have enormous payoffs in helping us understand and interpret the 
implications of various theory proposals and data in broad terms. An excellent exam- 
ple of that in recent years is by Adams et al. (2006), who showed that some theories 
in the infrared with a cutoff cannot be self-consistently embedded in an ultraviolet 
complete theory without violating standard assumptions regarding superluminality 
or causality. 


5.11 Implications for the LHC and Beyond 


Finally, I would like to make a comment about the implications of this discussion
for the LHC and other colliders that may come in the future. First, it is obvious that
we must be prepared for and search for higher-dimensional operators in the effective
SM that go beyond the relevant and marginal operators of the SM. This is indeed
happening at the LHC, and first indications of new physics may very well come from
small perturbations in SM observables due to the subtle effects of these suppressed
operators.


² There can be circumstances where a theory is observationally consistent in a vast number of
observables, but gets a few wrong, yet no other decent theory is around to replace it.
In other words, observational consistency is still the top criterion, but the best theory may not be
100% consistent.
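
To give a feel for how small the perturbations from suppressed operators can be, here is a minimal numerical sketch. It assumes the naive power-counting estimate c (E/M)^(d-4) for the relative shift in an observable; the function name `relative_shift`, its parameters, and the sample energies are hypothetical choices for illustration, and a real analysis would depend on the specific operator and observable.

```python
def relative_shift(energy_gev: float, scale_gev: float,
                   dimension: int = 6, coefficient: float = 1.0) -> float:
    """Naive power-counting estimate c * (E / M)**(d - 4).

    This is an order-of-magnitude assumption, not a computed cross section.
    """
    if dimension <= 4:
        raise ValueError("only operators with dimension > 4 are suppressed")
    if energy_gev <= 0 or scale_gev <= 0:
        raise ValueError("energies must be positive")
    return coefficient * (energy_gev / scale_gev) ** (dimension - 4)


if __name__ == "__main__":
    # Probe energy of 500 GeV against three hypothetical suppression scales.
    for scale in (1e3, 1e4, 1e5):
        shift = relative_shift(energy_gev=500.0, scale_gev=scale)
        print(f"M = {scale:.0e} GeV -> relative shift ~ {shift:.1e}")
```

Under this assumption each factor of ten in the suppression scale reduces a dimension-six effect a hundredfold, which is why such searches reward both high energy and high precision.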

However, there is a broader point to be made regarding implications for colliders.
In the years since the charm quark was discovered in the mid 1970s there has been 
tremendous progress experimentally and important new discoveries, including the 
recent discovery of a Higgs boson-like state (Aad et al. 2012; Chatrchyan et al. 2012), 
but no dramatic new discovery that can put us on a straight and narrow path beyond 
the SM. That may change soon at the LHC. Nevertheless, it is expensive in time and 
money to build higher energy colliders, our main reliable transporter into the high 
energy frontier. This limits the prospects for fast experimental progress. 

In the meantime though, hundreds of theories have been born and have died. Some 
have died due to incompatibility with new data (e.g., simplistic technicolor theories,
or simpleminded no-scale supersymmetry theories), but others have died under their
own self-consistency problems (e.g., some extra-dimensional models, some string
phenomenology models, etc.). In both cases, it was care in establishing consistency
with past data and mathematical rigor that doomed them. In that sense, progress
is made. Models come to the fore and fall under the spotlight or survive. When 
attempting to really explain everything, the consistency issues are stretched to the 
maximum. For example, it is not fully appreciated in the supersymmetry community 
that it may even be difficult to find a “natural” supersymmetric model that has a 
high enough reheat temperature to enable baryogenesis without causing problems 
elsewhere (Olechowski et al. 2009; Covi et al. 2011). There are many examples of 
ideas falling apart when they are pushed very hard to stand up to the full body of 
evidence of what we already know. 

Relatively speaking, theoretical research is inexpensive. It is natural that a shift 
develop in fundamental science. The code of values in theoretical research will likely 
alter in time, as experimental input slows. Ideas will be pursued more rigorously and 
analysed critically. Great ideas will always be welcome. However, soft model build- 
ing tweaks for simplicity and naturalness will become less valuable than rigorous 
tests of mathematical consistency. Distant future experimental implications identified 
for theories not fully vetted will become less valuable than rigorous computations of 
observational consistency across the board of all currently known data. One can hope 
that unsparing devotion to full consistency, both observational and mathematical, will 
be the hallmarks of the future era.